Assignment 01 Transfer Learning
In this notebook, you will create a neural network model to classify images of cats and dogs using transfer learning: you will use part of a pre-trained image classifier model (trained on ImageNet) as a feature extractor, and train additional new layers to perform the cats and dogs classification task.
Some code cells are provided for you in the notebook. You should avoid editing the provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line:
#### GRADED CELL ####
Don't move or edit this first line - this is what the automatic grader looks for to recognise graded cells. These cells require you to write your own code to complete them, and are automatically graded when you submit the notebook. Don't edit the function name or signature provided in these cells, otherwise the automatic grader might not function properly. Inside these graded cells, you can use any functions or classes that are imported below, but make sure you don't use any variables that are outside the scope of the function.
Complete all the tasks you are asked for in the worksheet. When you have finished and are happy with your code, press the Submit Assignment button at the top of this notebook.
We'll start by running some imports and loading the dataset. Do not edit the existing imports in the following cell. If you would like to make further TensorFlow imports, you should add them here.
#### PACKAGE IMPORTS ####
# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
import numpy as np
import os
import pandas as pd
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
# If you would like to make further imports from TensorFlow, add them here
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.optimizers import RMSprop

In this assignment, you will use the Dogs vs Cats dataset, which was used for a 2013 Kaggle competition. It consists of 25,000 images containing either a cat or a dog; we will use only a subset of 600 images and their labels. The dataset is itself a subset of a much larger collection of 3 million photos that were originally used as a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), referred to as "Asirra" (Animal Species Image Recognition for Restricting Access).
Your goal is to train a classifier model using part of a pre-trained image classifier, using the principle of transfer learning.
images_train = np.load('data/images_train.npy') / 255.
images_valid = np.load('data/images_valid.npy') / 255.
images_test = np.load('data/images_test.npy') / 255.
labels_train = np.load('data/labels_train.npy')
labels_valid = np.load('data/labels_valid.npy')
labels_test = np.load('data/labels_test.npy')
print("{} training data examples".format(images_train.shape[0]))
print("{} validation data examples".format(images_valid.shape[0]))
print("{} test data examples".format(images_test.shape[0]))
600 training data examples 300 validation data examples 300 test data examples
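The division by 255. in the loading cell rescales the raw uint8 pixel values into the [0, 1] range, which keeps the network's inputs on a consistent, small scale. A minimal sketch of the same rescaling on a synthetic array (the shape and values here are illustrative, not the real dataset):

```python
import numpy as np

# Synthetic stand-in for a tiny batch of uint8 RGB image data (values 0-255)
fake_images = np.array([[[[0, 128, 255]]]], dtype=np.uint8)  # shape (1, 1, 1, 3)

# Rescale to floats in [0, 1], as done for images_train/valid/test above
scaled = fake_images / 255.

print(scaled.dtype)               # float64
print(scaled.min(), scaled.max()) # 0.0 1.0
```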
# Display a few images and labels
class_names = np.array(['Dog', 'Cat'])
plt.figure(figsize=(15,10))
inx = np.random.choice(images_train.shape[0], 15, replace=False)
for n, i in enumerate(inx):
    ax = plt.subplot(3, 5, n + 1)
    plt.imshow(images_train[i])
    plt.title(class_names[labels_train[i]])
    plt.axis('off')
We will first train a CNN classifier model as a benchmark before implementing the transfer learning approach. Using the functional API, build the benchmark model according to the following specifications:
- The model should use the input_shape in the function argument to set the shape in the Input layer.
- Two Conv2D layers with 32 filters, a 3x3 kernel, ReLU activation and 'SAME' padding, followed by a MaxPooling2D layer with a 2x2 pool size.
- Two Conv2D layers with 64 filters, a 3x3 kernel, ReLU activation and 'SAME' padding, followed by another MaxPooling2D layer with a 2x2 pool size.
- Two Conv2D layers with 128 filters, a 3x3 kernel, ReLU activation and 'SAME' padding, followed by another MaxPooling2D layer with a 2x2 pool size.
- A Flatten layer, then a Dense layer with 128 units and ReLU activation, and a final Dense output layer with 1 unit and sigmoid activation.
- In total, the network should have 13 layers (including the Input layer).
The model should then be compiled with the RMSprop optimiser with learning rate 0.001, binary cross entropy loss and binary accuracy metric.
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def get_benchmark_model(input_shape):
    """
    This function should build and compile a CNN model according to the above specification,
    using the functional API. The function takes input_shape as an argument, which should be
    used to specify the shape in the Input layer.
    Your function should return the model.
    """
    inputs = Input(shape=input_shape)
    h = Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='same')(inputs)
    h = Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='same')(h)
    h = MaxPool2D(pool_size=(2, 2))(h)
    h = Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding='same')(h)
    h = Conv2D(filters=64, kernel_size=(3, 3), activation='relu', padding='same')(h)
    h = MaxPool2D(pool_size=(2, 2))(h)
    h = Conv2D(filters=128, kernel_size=(3, 3), activation='relu', padding='same')(h)
    h = Conv2D(filters=128, kernel_size=(3, 3), activation='relu', padding='same')(h)
    h = MaxPool2D(pool_size=(2, 2))(h)
    h = Flatten()(h)
    h = Dense(units=128, activation='relu')(h)
    outputs = Dense(units=1, activation='sigmoid')(h)
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(
        optimizer=RMSprop(learning_rate=1e-3),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model
# Build and compile the benchmark model, and display the model summary
benchmark_model = get_benchmark_model(images_train[0].shape)
benchmark_model.summary()
Model: "model" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_1 (InputLayer) [(None, 160, 160, 3)] 0 _________________________________________________________________ conv2d (Conv2D) (None, 160, 160, 32) 896 _________________________________________________________________ conv2d_1 (Conv2D) (None, 160, 160, 32) 9248 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 80, 80, 32) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 80, 80, 64) 18496 _________________________________________________________________ conv2d_3 (Conv2D) (None, 80, 80, 64) 36928 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 40, 40, 64) 0 _________________________________________________________________ conv2d_4 (Conv2D) (None, 40, 40, 128) 73856 _________________________________________________________________ conv2d_5 (Conv2D) (None, 40, 40, 128) 147584 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 20, 20, 128) 0 _________________________________________________________________ flatten (Flatten) (None, 51200) 0 _________________________________________________________________ dense (Dense) (None, 128) 6553728 _________________________________________________________________ dense_1 (Dense) (None, 1) 129 ================================================================= Total params: 6,840,865 Trainable params: 6,840,865 Non-trainable params: 0 _________________________________________________________________
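The parameter counts in the summary can be checked by hand: a Conv2D layer with k×k kernels has k·k·in_channels·filters weights plus one bias per filter, and a Dense layer has in_units·units weights plus one bias per unit. A quick sketch reproducing a few of the counts above (the helper names here are just for illustration):

```python
def conv2d_params(k, in_ch, filters):
    # k*k kernel applied across in_ch input channels, plus one bias per filter
    return k * k * in_ch * filters + filters

def dense_params(in_units, units):
    # one weight per (input, output) pair, plus one bias per output unit
    return in_units * units + units

print(conv2d_params(3, 3, 32))           # first conv layer: 896
print(conv2d_params(3, 32, 32))          # second conv layer: 9248
print(dense_params(20 * 20 * 128, 128))  # dense layer after flatten: 6553728
print(dense_params(128, 1))              # sigmoid output layer: 129
```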
We will train the benchmark CNN model using an EarlyStopping callback. Feel free to increase the training time if you wish.
# Fit the benchmark model and save its training history
earlystopping = tf.keras.callbacks.EarlyStopping(patience=2)
history_benchmark = benchmark_model.fit(
    images_train, labels_train, epochs=10, batch_size=32,
    validation_data=(images_valid, labels_valid), callbacks=[earlystopping]
)
Epoch 1/10 19/19 [==============================] - 3s 157ms/step - loss: 0.7449 - accuracy: 0.4983 - val_loss: 0.6924 - val_accuracy: 0.5100 Epoch 2/10 19/19 [==============================] - 1s 78ms/step - loss: 0.6937 - accuracy: 0.5200 - val_loss: 0.6933 - val_accuracy: 0.5000 Epoch 3/10 19/19 [==============================] - 1s 73ms/step - loss: 0.6908 - accuracy: 0.5267 - val_loss: 0.6982 - val_accuracy: 0.5000
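Note that training stopped after 3 of the 10 epochs: with patience=2, EarlyStopping halts once the monitored quantity (val_loss by default) has failed to improve for 2 consecutive epochs. A plain-Python sketch of that stopping rule, applied to the validation losses printed above (the function name is illustrative, not a Keras API):

```python
def early_stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:   # improvement: record it and reset the patience counter
            best = loss
            wait = 0
        else:             # no improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Validation losses from the training run above: training stops at epoch 3
print(early_stop_epoch([0.6924, 0.6933, 0.6982], patience=2))  # 3
```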
# Run this cell to plot accuracy vs epoch and loss vs epoch
plt.figure(figsize = (15,5))
plt.subplot(121)
try:
    plt.plot(history_benchmark.history['accuracy'])
    plt.plot(history_benchmark.history['val_accuracy'])
except KeyError:
    plt.plot(history_benchmark.history['acc'])
    plt.plot(history_benchmark.history['val_acc'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'lower right')
plt.subplot(122)
plt.plot(history_benchmark.history['loss'])
plt.plot(history_benchmark.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'upper right')
plt.show()
# Evaluate the benchmark model on the test set
benchmark_test_loss, benchmark_test_acc = benchmark_model.evaluate(images_test, labels_test, verbose=0)
print("Test loss: {}".format(benchmark_test_loss))
print("Test accuracy: {}".format(benchmark_test_acc))
Test loss: 0.6976297497749329 Test accuracy: 0.503333330154419
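The accuracy reported here is binary accuracy: the model's sigmoid output is thresholded at 0.5 to produce a 0/1 prediction, which is compared against the label. A minimal NumPy sketch of that computation (the probabilities and labels below are made up for illustration):

```python
import numpy as np

# Hypothetical sigmoid outputs and true labels
probs = np.array([0.9, 0.3, 0.6, 0.2])
labels = np.array([1, 0, 0, 0])

preds = (probs >= 0.5).astype(int)   # threshold at 0.5
accuracy = np.mean(preds == labels)  # fraction of correct predictions

print(preds)     # [1 0 1 0]
print(accuracy)  # 0.75
```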
You will now begin to build your image classifier using transfer learning.
You will use the pre-trained MobileNet V2 model, available to download from Keras Applications. However, we have already downloaded the pretrained model for you, and it is available at the location ./models/MobileNetV2.h5.
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def load_pretrained_MobileNetV2(path):
    """
    This function takes a path as an argument, and uses it to
    load the full MobileNetV2 pretrained model from the path.
    Your function should return the loaded model.
    """
    return load_model(path)
# Call the function loading the pretrained model and display its summary
base_model = load_pretrained_MobileNetV2('models/MobileNetV2.h5')
base_model.summary()
WARNING:tensorflow:No training configuration found in the save file, so the model was *not* compiled. Compile it manually.
Model: "mobilenetv2_1.00_160"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_6 (InputLayer) [(None, 160, 160, 3) 0
__________________________________________________________________________________________________
Conv1_pad (ZeroPadding2D) (None, 161, 161, 3) 0 input_6[0][0]
__________________________________________________________________________________________________
Conv1 (Conv2D) (None, 80, 80, 32) 864 Conv1_pad[0][0]
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalization) (None, 80, 80, 32) 128 Conv1[0][0]
__________________________________________________________________________________________________
Conv1_relu (ReLU) (None, 80, 80, 32) 0 bn_Conv1[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise (Depthw (None, 80, 80, 32) 288 Conv1_relu[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_BN (Bat (None, 80, 80, 32) 128 expanded_conv_depthwise[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_relu (R (None, 80, 80, 32) 0 expanded_conv_depthwise_BN[0][0]
__________________________________________________________________________________________________
expanded_conv_project (Conv2D) (None, 80, 80, 16) 512 expanded_conv_depthwise_relu[0][0
__________________________________________________________________________________________________
expanded_conv_project_BN (Batch (None, 80, 80, 16) 64 expanded_conv_project[0][0]
__________________________________________________________________________________________________
block_1_expand (Conv2D) (None, 80, 80, 96) 1536 expanded_conv_project_BN[0][0]
__________________________________________________________________________________________________
block_1_expand_BN (BatchNormali (None, 80, 80, 96) 384 block_1_expand[0][0]
__________________________________________________________________________________________________
block_1_expand_relu (ReLU) (None, 80, 80, 96) 0 block_1_expand_BN[0][0]
__________________________________________________________________________________________________
block_1_pad (ZeroPadding2D) (None, 81, 81, 96) 0 block_1_expand_relu[0][0]
__________________________________________________________________________________________________
block_1_depthwise (DepthwiseCon (None, 40, 40, 96) 864 block_1_pad[0][0]
__________________________________________________________________________________________________
block_1_depthwise_BN (BatchNorm (None, 40, 40, 96) 384 block_1_depthwise[0][0]
__________________________________________________________________________________________________
block_1_depthwise_relu (ReLU) (None, 40, 40, 96) 0 block_1_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_1_project (Conv2D) (None, 40, 40, 24) 2304 block_1_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_1_project_BN (BatchNormal (None, 40, 40, 24) 96 block_1_project[0][0]
__________________________________________________________________________________________________
block_2_expand (Conv2D) (None, 40, 40, 144) 3456 block_1_project_BN[0][0]
__________________________________________________________________________________________________
block_2_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_2_expand[0][0]
__________________________________________________________________________________________________
block_2_expand_relu (ReLU) (None, 40, 40, 144) 0 block_2_expand_BN[0][0]
__________________________________________________________________________________________________
block_2_depthwise (DepthwiseCon (None, 40, 40, 144) 1296 block_2_expand_relu[0][0]
__________________________________________________________________________________________________
block_2_depthwise_BN (BatchNorm (None, 40, 40, 144) 576 block_2_depthwise[0][0]
__________________________________________________________________________________________________
block_2_depthwise_relu (ReLU) (None, 40, 40, 144) 0 block_2_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_2_project (Conv2D) (None, 40, 40, 24) 3456 block_2_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_2_project_BN (BatchNormal (None, 40, 40, 24) 96 block_2_project[0][0]
__________________________________________________________________________________________________
block_2_add (Add) (None, 40, 40, 24) 0 block_1_project_BN[0][0]
block_2_project_BN[0][0]
__________________________________________________________________________________________________
block_3_expand (Conv2D) (None, 40, 40, 144) 3456 block_2_add[0][0]
__________________________________________________________________________________________________
block_3_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_3_expand[0][0]
__________________________________________________________________________________________________
block_3_expand_relu (ReLU) (None, 40, 40, 144) 0 block_3_expand_BN[0][0]
__________________________________________________________________________________________________
block_3_pad (ZeroPadding2D) (None, 41, 41, 144) 0 block_3_expand_relu[0][0]
__________________________________________________________________________________________________
block_3_depthwise (DepthwiseCon (None, 20, 20, 144) 1296 block_3_pad[0][0]
__________________________________________________________________________________________________
block_3_depthwise_BN (BatchNorm (None, 20, 20, 144) 576 block_3_depthwise[0][0]
__________________________________________________________________________________________________
block_3_depthwise_relu (ReLU) (None, 20, 20, 144) 0 block_3_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_3_project (Conv2D) (None, 20, 20, 32) 4608 block_3_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_3_project_BN (BatchNormal (None, 20, 20, 32) 128 block_3_project[0][0]
__________________________________________________________________________________________________
block_4_expand (Conv2D) (None, 20, 20, 192) 6144 block_3_project_BN[0][0]
__________________________________________________________________________________________________
block_4_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_4_expand[0][0]
__________________________________________________________________________________________________
block_4_expand_relu (ReLU) (None, 20, 20, 192) 0 block_4_expand_BN[0][0]
__________________________________________________________________________________________________
block_4_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_4_expand_relu[0][0]
__________________________________________________________________________________________________
block_4_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_4_depthwise[0][0]
__________________________________________________________________________________________________
block_4_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_4_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_4_project (Conv2D) (None, 20, 20, 32) 6144 block_4_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_4_project_BN (BatchNormal (None, 20, 20, 32) 128 block_4_project[0][0]
__________________________________________________________________________________________________
block_4_add (Add) (None, 20, 20, 32) 0 block_3_project_BN[0][0]
block_4_project_BN[0][0]
__________________________________________________________________________________________________
block_5_expand (Conv2D) (None, 20, 20, 192) 6144 block_4_add[0][0]
__________________________________________________________________________________________________
block_5_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_5_expand[0][0]
__________________________________________________________________________________________________
block_5_expand_relu (ReLU) (None, 20, 20, 192) 0 block_5_expand_BN[0][0]
__________________________________________________________________________________________________
block_5_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_5_expand_relu[0][0]
__________________________________________________________________________________________________
block_5_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_5_depthwise[0][0]
__________________________________________________________________________________________________
block_5_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_5_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_5_project (Conv2D) (None, 20, 20, 32) 6144 block_5_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_5_project_BN (BatchNormal (None, 20, 20, 32) 128 block_5_project[0][0]
__________________________________________________________________________________________________
block_5_add (Add) (None, 20, 20, 32) 0 block_4_add[0][0]
block_5_project_BN[0][0]
__________________________________________________________________________________________________
block_6_expand (Conv2D) (None, 20, 20, 192) 6144 block_5_add[0][0]
__________________________________________________________________________________________________
block_6_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_6_expand[0][0]
__________________________________________________________________________________________________
block_6_expand_relu (ReLU) (None, 20, 20, 192) 0 block_6_expand_BN[0][0]
__________________________________________________________________________________________________
block_6_pad (ZeroPadding2D) (None, 21, 21, 192) 0 block_6_expand_relu[0][0]
__________________________________________________________________________________________________
block_6_depthwise (DepthwiseCon (None, 10, 10, 192) 1728 block_6_pad[0][0]
__________________________________________________________________________________________________
block_6_depthwise_BN (BatchNorm (None, 10, 10, 192) 768 block_6_depthwise[0][0]
__________________________________________________________________________________________________
block_6_depthwise_relu (ReLU) (None, 10, 10, 192) 0 block_6_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_6_project (Conv2D) (None, 10, 10, 64) 12288 block_6_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_6_project_BN (BatchNormal (None, 10, 10, 64) 256 block_6_project[0][0]
__________________________________________________________________________________________________
block_7_expand (Conv2D) (None, 10, 10, 384) 24576 block_6_project_BN[0][0]
__________________________________________________________________________________________________
block_7_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_7_expand[0][0]
__________________________________________________________________________________________________
block_7_expand_relu (ReLU) (None, 10, 10, 384) 0 block_7_expand_BN[0][0]
__________________________________________________________________________________________________
block_7_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_7_expand_relu[0][0]
__________________________________________________________________________________________________
block_7_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_7_depthwise[0][0]
__________________________________________________________________________________________________
block_7_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_7_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_7_project (Conv2D) (None, 10, 10, 64) 24576 block_7_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_7_project_BN (BatchNormal (None, 10, 10, 64) 256 block_7_project[0][0]
__________________________________________________________________________________________________
block_7_add (Add) (None, 10, 10, 64) 0 block_6_project_BN[0][0]
block_7_project_BN[0][0]
__________________________________________________________________________________________________
block_8_expand (Conv2D) (None, 10, 10, 384) 24576 block_7_add[0][0]
__________________________________________________________________________________________________
block_8_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_8_expand[0][0]
__________________________________________________________________________________________________
block_8_expand_relu (ReLU) (None, 10, 10, 384) 0 block_8_expand_BN[0][0]
__________________________________________________________________________________________________
block_8_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_8_expand_relu[0][0]
__________________________________________________________________________________________________
block_8_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_8_depthwise[0][0]
__________________________________________________________________________________________________
block_8_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_8_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_8_project (Conv2D) (None, 10, 10, 64) 24576 block_8_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_8_project_BN (BatchNormal (None, 10, 10, 64) 256 block_8_project[0][0]
__________________________________________________________________________________________________
block_8_add (Add) (None, 10, 10, 64) 0 block_7_add[0][0]
block_8_project_BN[0][0]
__________________________________________________________________________________________________
block_9_expand (Conv2D) (None, 10, 10, 384) 24576 block_8_add[0][0]
__________________________________________________________________________________________________
block_9_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_9_expand[0][0]
__________________________________________________________________________________________________
block_9_expand_relu (ReLU) (None, 10, 10, 384) 0 block_9_expand_BN[0][0]
__________________________________________________________________________________________________
block_9_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_9_expand_relu[0][0]
__________________________________________________________________________________________________
block_9_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_9_depthwise[0][0]
__________________________________________________________________________________________________
block_9_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_9_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_9_project (Conv2D) (None, 10, 10, 64) 24576 block_9_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_9_project_BN (BatchNormal (None, 10, 10, 64) 256 block_9_project[0][0]
__________________________________________________________________________________________________
block_9_add (Add) (None, 10, 10, 64) 0 block_8_add[0][0]
block_9_project_BN[0][0]
__________________________________________________________________________________________________
block_10_expand (Conv2D) (None, 10, 10, 384) 24576 block_9_add[0][0]
__________________________________________________________________________________________________
block_10_expand_BN (BatchNormal (None, 10, 10, 384) 1536 block_10_expand[0][0]
__________________________________________________________________________________________________
block_10_expand_relu (ReLU) (None, 10, 10, 384) 0 block_10_expand_BN[0][0]
__________________________________________________________________________________________________
block_10_depthwise (DepthwiseCo (None, 10, 10, 384) 3456 block_10_expand_relu[0][0]
__________________________________________________________________________________________________
block_10_depthwise_BN (BatchNor (None, 10, 10, 384) 1536 block_10_depthwise[0][0]
__________________________________________________________________________________________________
block_10_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_10_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_10_project (Conv2D) (None, 10, 10, 96) 36864 block_10_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_10_project_BN (BatchNorma (None, 10, 10, 96) 384 block_10_project[0][0]
__________________________________________________________________________________________________
block_11_expand (Conv2D) (None, 10, 10, 576) 55296 block_10_project_BN[0][0]
__________________________________________________________________________________________________
block_11_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_11_expand[0][0]
__________________________________________________________________________________________________
block_11_expand_relu (ReLU) (None, 10, 10, 576) 0 block_11_expand_BN[0][0]
__________________________________________________________________________________________________
block_11_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_11_expand_relu[0][0]
__________________________________________________________________________________________________
block_11_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_11_depthwise[0][0]
__________________________________________________________________________________________________
block_11_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_11_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_11_project (Conv2D) (None, 10, 10, 96) 55296 block_11_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_11_project_BN (BatchNorma (None, 10, 10, 96) 384 block_11_project[0][0]
__________________________________________________________________________________________________
block_11_add (Add) (None, 10, 10, 96) 0 block_10_project_BN[0][0]
block_11_project_BN[0][0]
__________________________________________________________________________________________________
block_12_expand (Conv2D) (None, 10, 10, 576) 55296 block_11_add[0][0]
__________________________________________________________________________________________________
block_12_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_12_expand[0][0]
__________________________________________________________________________________________________
block_12_expand_relu (ReLU) (None, 10, 10, 576) 0 block_12_expand_BN[0][0]
__________________________________________________________________________________________________
block_12_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_12_expand_relu[0][0]
__________________________________________________________________________________________________
block_12_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_12_depthwise[0][0]
__________________________________________________________________________________________________
block_12_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_12_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_12_project (Conv2D) (None, 10, 10, 96) 55296 block_12_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_12_project_BN (BatchNorma (None, 10, 10, 96) 384 block_12_project[0][0]
__________________________________________________________________________________________________
block_12_add (Add) (None, 10, 10, 96) 0 block_11_add[0][0]
block_12_project_BN[0][0]
__________________________________________________________________________________________________
block_13_expand (Conv2D) (None, 10, 10, 576) 55296 block_12_add[0][0]
__________________________________________________________________________________________________
block_13_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_13_expand[0][0]
__________________________________________________________________________________________________
block_13_expand_relu (ReLU) (None, 10, 10, 576) 0 block_13_expand_BN[0][0]
__________________________________________________________________________________________________
block_13_pad (ZeroPadding2D) (None, 11, 11, 576) 0 block_13_expand_relu[0][0]
__________________________________________________________________________________________________
block_13_depthwise (DepthwiseCo (None, 5, 5, 576) 5184 block_13_pad[0][0]
__________________________________________________________________________________________________
block_13_depthwise_BN (BatchNor (None, 5, 5, 576) 2304 block_13_depthwise[0][0]
__________________________________________________________________________________________________
block_13_depthwise_relu (ReLU) (None, 5, 5, 576) 0 block_13_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_13_project (Conv2D) (None, 5, 5, 160) 92160 block_13_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_13_project_BN (BatchNorma (None, 5, 5, 160) 640 block_13_project[0][0]
__________________________________________________________________________________________________
block_14_expand (Conv2D) (None, 5, 5, 960) 153600 block_13_project_BN[0][0]
__________________________________________________________________________________________________
block_14_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_14_expand[0][0]
__________________________________________________________________________________________________
block_14_expand_relu (ReLU) (None, 5, 5, 960) 0 block_14_expand_BN[0][0]
__________________________________________________________________________________________________
block_14_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_14_expand_relu[0][0]
__________________________________________________________________________________________________
block_14_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_14_depthwise[0][0]
__________________________________________________________________________________________________
block_14_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_14_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_14_project (Conv2D) (None, 5, 5, 160) 153600 block_14_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_14_project_BN (BatchNorma (None, 5, 5, 160) 640 block_14_project[0][0]
__________________________________________________________________________________________________
block_14_add (Add) (None, 5, 5, 160) 0 block_13_project_BN[0][0]
block_14_project_BN[0][0]
__________________________________________________________________________________________________
block_15_expand (Conv2D) (None, 5, 5, 960) 153600 block_14_add[0][0]
__________________________________________________________________________________________________
block_15_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_15_expand[0][0]
__________________________________________________________________________________________________
block_15_expand_relu (ReLU) (None, 5, 5, 960) 0 block_15_expand_BN[0][0]
__________________________________________________________________________________________________
block_15_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_15_expand_relu[0][0]
__________________________________________________________________________________________________
block_15_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_15_depthwise[0][0]
__________________________________________________________________________________________________
block_15_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_15_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_15_project (Conv2D) (None, 5, 5, 160) 153600 block_15_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_15_project_BN (BatchNorma (None, 5, 5, 160) 640 block_15_project[0][0]
__________________________________________________________________________________________________
block_15_add (Add) (None, 5, 5, 160) 0 block_14_add[0][0]
block_15_project_BN[0][0]
__________________________________________________________________________________________________
block_16_expand (Conv2D) (None, 5, 5, 960) 153600 block_15_add[0][0]
__________________________________________________________________________________________________
block_16_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_16_expand[0][0]
__________________________________________________________________________________________________
block_16_expand_relu (ReLU) (None, 5, 5, 960) 0 block_16_expand_BN[0][0]
__________________________________________________________________________________________________
block_16_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_16_expand_relu[0][0]
__________________________________________________________________________________________________
block_16_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_16_depthwise[0][0]
__________________________________________________________________________________________________
block_16_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_16_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_16_project (Conv2D) (None, 5, 5, 320) 307200 block_16_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_16_project_BN (BatchNorma (None, 5, 5, 320) 1280 block_16_project[0][0]
__________________________________________________________________________________________________
Conv_1 (Conv2D) (None, 5, 5, 1280) 409600 block_16_project_BN[0][0]
__________________________________________________________________________________________________
Conv_1_bn (BatchNormalization) (None, 5, 5, 1280) 5120 Conv_1[0][0]
__________________________________________________________________________________________________
out_relu (ReLU) (None, 5, 5, 1280) 0 Conv_1_bn[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_6 (Glo (None, 1280) 0 out_relu[0][0]
__________________________________________________________________________________________________
Logits (Dense) (None, 1000) 1281000 global_average_pooling2d_6[0][0]
==================================================================================================
Total params: 3,538,984
Trainable params: 3,504,872
Non-trainable params: 34,112
__________________________________________________________________________________________________
You will now remove the final classification layer of the network and replace it with new, untrained classifier layers suited to your task. First, you will create a new model that has the same input tensor as the MobileNetV2 model, and uses the output tensor from the layer named global_average_pooling2d_6 as the model output.
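The pattern of slicing a model at a named layer can be illustrated on a small toy network (the layer names `hidden` and `logits` below are hypothetical, not the MobileNetV2 names): build a functional model, look up an intermediate layer with `get_layer`, and construct a new `Model` from the original inputs to that layer's output tensor. The new model shares weights with the original.

```python
import tensorflow as tf
from tensorflow.keras.models import Model

# A small functional toy model with named layers
inp = tf.keras.Input(shape=(4,))
x = tf.keras.layers.Dense(8, activation='relu', name='hidden')(inp)
out = tf.keras.layers.Dense(3, name='logits')(x)
full = Model(inputs=inp, outputs=out)

# New model sharing weights with `full`, truncated at the 'hidden' layer
trunk = Model(inputs=full.inputs, outputs=full.get_layer('hidden').output)
print(trunk.output_shape)  # (None, 8)
```

Note that because the two models share layers, no weights are copied: training `trunk` would also update `full`.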
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def remove_head(pretrained_model):
    """
    This function should create and return a new model, using the input and output
    tensors as specified above.
    Use the 'get_layer' method to access the correct layer of the pre-trained model.
    """
    inputs = pretrained_model.inputs
    outputs = pretrained_model.get_layer('global_average_pooling2d_6').output
    return Model(inputs=inputs, outputs=outputs)
# Call the function to remove the classification head and display the summary
feature_extractor = remove_head(base_model)
feature_extractor.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_6 (InputLayer) [(None, 160, 160, 3) 0
__________________________________________________________________________________________________
Conv1_pad (ZeroPadding2D) (None, 161, 161, 3) 0 input_6[0][0]
__________________________________________________________________________________________________
Conv1 (Conv2D) (None, 80, 80, 32) 864 Conv1_pad[0][0]
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalization) (None, 80, 80, 32) 128 Conv1[0][0]
__________________________________________________________________________________________________
Conv1_relu (ReLU) (None, 80, 80, 32) 0 bn_Conv1[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise (Depthw (None, 80, 80, 32) 288 Conv1_relu[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_BN (Bat (None, 80, 80, 32) 128 expanded_conv_depthwise[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_relu (R (None, 80, 80, 32) 0 expanded_conv_depthwise_BN[0][0]
__________________________________________________________________________________________________
expanded_conv_project (Conv2D) (None, 80, 80, 16) 512 expanded_conv_depthwise_relu[0][0
__________________________________________________________________________________________________
expanded_conv_project_BN (Batch (None, 80, 80, 16) 64 expanded_conv_project[0][0]
__________________________________________________________________________________________________
block_1_expand (Conv2D) (None, 80, 80, 96) 1536 expanded_conv_project_BN[0][0]
__________________________________________________________________________________________________
block_1_expand_BN (BatchNormali (None, 80, 80, 96) 384 block_1_expand[0][0]
__________________________________________________________________________________________________
block_1_expand_relu (ReLU) (None, 80, 80, 96) 0 block_1_expand_BN[0][0]
__________________________________________________________________________________________________
block_1_pad (ZeroPadding2D) (None, 81, 81, 96) 0 block_1_expand_relu[0][0]
__________________________________________________________________________________________________
block_1_depthwise (DepthwiseCon (None, 40, 40, 96) 864 block_1_pad[0][0]
__________________________________________________________________________________________________
block_1_depthwise_BN (BatchNorm (None, 40, 40, 96) 384 block_1_depthwise[0][0]
__________________________________________________________________________________________________
block_1_depthwise_relu (ReLU) (None, 40, 40, 96) 0 block_1_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_1_project (Conv2D) (None, 40, 40, 24) 2304 block_1_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_1_project_BN (BatchNormal (None, 40, 40, 24) 96 block_1_project[0][0]
__________________________________________________________________________________________________
block_2_expand (Conv2D) (None, 40, 40, 144) 3456 block_1_project_BN[0][0]
__________________________________________________________________________________________________
block_2_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_2_expand[0][0]
__________________________________________________________________________________________________
block_2_expand_relu (ReLU) (None, 40, 40, 144) 0 block_2_expand_BN[0][0]
__________________________________________________________________________________________________
block_2_depthwise (DepthwiseCon (None, 40, 40, 144) 1296 block_2_expand_relu[0][0]
__________________________________________________________________________________________________
block_2_depthwise_BN (BatchNorm (None, 40, 40, 144) 576 block_2_depthwise[0][0]
__________________________________________________________________________________________________
block_2_depthwise_relu (ReLU) (None, 40, 40, 144) 0 block_2_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_2_project (Conv2D) (None, 40, 40, 24) 3456 block_2_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_2_project_BN (BatchNormal (None, 40, 40, 24) 96 block_2_project[0][0]
__________________________________________________________________________________________________
block_2_add (Add) (None, 40, 40, 24) 0 block_1_project_BN[0][0]
block_2_project_BN[0][0]
__________________________________________________________________________________________________
block_3_expand (Conv2D) (None, 40, 40, 144) 3456 block_2_add[0][0]
__________________________________________________________________________________________________
block_3_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_3_expand[0][0]
__________________________________________________________________________________________________
block_3_expand_relu (ReLU) (None, 40, 40, 144) 0 block_3_expand_BN[0][0]
__________________________________________________________________________________________________
block_3_pad (ZeroPadding2D) (None, 41, 41, 144) 0 block_3_expand_relu[0][0]
__________________________________________________________________________________________________
block_3_depthwise (DepthwiseCon (None, 20, 20, 144) 1296 block_3_pad[0][0]
__________________________________________________________________________________________________
block_3_depthwise_BN (BatchNorm (None, 20, 20, 144) 576 block_3_depthwise[0][0]
__________________________________________________________________________________________________
block_3_depthwise_relu (ReLU) (None, 20, 20, 144) 0 block_3_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_3_project (Conv2D) (None, 20, 20, 32) 4608 block_3_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_3_project_BN (BatchNormal (None, 20, 20, 32) 128 block_3_project[0][0]
__________________________________________________________________________________________________
block_4_expand (Conv2D) (None, 20, 20, 192) 6144 block_3_project_BN[0][0]
__________________________________________________________________________________________________
block_4_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_4_expand[0][0]
__________________________________________________________________________________________________
block_4_expand_relu (ReLU) (None, 20, 20, 192) 0 block_4_expand_BN[0][0]
__________________________________________________________________________________________________
block_4_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_4_expand_relu[0][0]
__________________________________________________________________________________________________
block_4_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_4_depthwise[0][0]
__________________________________________________________________________________________________
block_4_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_4_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_4_project (Conv2D) (None, 20, 20, 32) 6144 block_4_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_4_project_BN (BatchNormal (None, 20, 20, 32) 128 block_4_project[0][0]
__________________________________________________________________________________________________
block_4_add (Add) (None, 20, 20, 32) 0 block_3_project_BN[0][0]
block_4_project_BN[0][0]
__________________________________________________________________________________________________
block_5_expand (Conv2D) (None, 20, 20, 192) 6144 block_4_add[0][0]
__________________________________________________________________________________________________
block_5_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_5_expand[0][0]
__________________________________________________________________________________________________
block_5_expand_relu (ReLU) (None, 20, 20, 192) 0 block_5_expand_BN[0][0]
__________________________________________________________________________________________________
block_5_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_5_expand_relu[0][0]
__________________________________________________________________________________________________
block_5_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_5_depthwise[0][0]
__________________________________________________________________________________________________
block_5_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_5_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_5_project (Conv2D) (None, 20, 20, 32) 6144 block_5_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_5_project_BN (BatchNormal (None, 20, 20, 32) 128 block_5_project[0][0]
__________________________________________________________________________________________________
block_5_add (Add) (None, 20, 20, 32) 0 block_4_add[0][0]
block_5_project_BN[0][0]
__________________________________________________________________________________________________
block_6_expand (Conv2D) (None, 20, 20, 192) 6144 block_5_add[0][0]
__________________________________________________________________________________________________
block_6_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_6_expand[0][0]
__________________________________________________________________________________________________
block_6_expand_relu (ReLU) (None, 20, 20, 192) 0 block_6_expand_BN[0][0]
__________________________________________________________________________________________________
block_6_pad (ZeroPadding2D) (None, 21, 21, 192) 0 block_6_expand_relu[0][0]
__________________________________________________________________________________________________
block_6_depthwise (DepthwiseCon (None, 10, 10, 192) 1728 block_6_pad[0][0]
__________________________________________________________________________________________________
block_6_depthwise_BN (BatchNorm (None, 10, 10, 192) 768 block_6_depthwise[0][0]
__________________________________________________________________________________________________
block_6_depthwise_relu (ReLU) (None, 10, 10, 192) 0 block_6_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_6_project (Conv2D) (None, 10, 10, 64) 12288 block_6_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_6_project_BN (BatchNormal (None, 10, 10, 64) 256 block_6_project[0][0]
__________________________________________________________________________________________________
block_7_expand (Conv2D) (None, 10, 10, 384) 24576 block_6_project_BN[0][0]
__________________________________________________________________________________________________
block_7_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_7_expand[0][0]
__________________________________________________________________________________________________
block_7_expand_relu (ReLU) (None, 10, 10, 384) 0 block_7_expand_BN[0][0]
__________________________________________________________________________________________________
block_7_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_7_expand_relu[0][0]
__________________________________________________________________________________________________
block_7_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_7_depthwise[0][0]
__________________________________________________________________________________________________
block_7_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_7_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_7_project (Conv2D) (None, 10, 10, 64) 24576 block_7_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_7_project_BN (BatchNormal (None, 10, 10, 64) 256 block_7_project[0][0]
__________________________________________________________________________________________________
block_7_add (Add) (None, 10, 10, 64) 0 block_6_project_BN[0][0]
block_7_project_BN[0][0]
__________________________________________________________________________________________________
block_8_expand (Conv2D) (None, 10, 10, 384) 24576 block_7_add[0][0]
__________________________________________________________________________________________________
block_8_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_8_expand[0][0]
__________________________________________________________________________________________________
block_8_expand_relu (ReLU) (None, 10, 10, 384) 0 block_8_expand_BN[0][0]
__________________________________________________________________________________________________
block_8_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_8_expand_relu[0][0]
__________________________________________________________________________________________________
block_8_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_8_depthwise[0][0]
__________________________________________________________________________________________________
block_8_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_8_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_8_project (Conv2D) (None, 10, 10, 64) 24576 block_8_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_8_project_BN (BatchNormal (None, 10, 10, 64) 256 block_8_project[0][0]
__________________________________________________________________________________________________
block_8_add (Add) (None, 10, 10, 64) 0 block_7_add[0][0]
block_8_project_BN[0][0]
__________________________________________________________________________________________________
block_9_expand (Conv2D) (None, 10, 10, 384) 24576 block_8_add[0][0]
__________________________________________________________________________________________________
block_9_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_9_expand[0][0]
__________________________________________________________________________________________________
block_9_expand_relu (ReLU) (None, 10, 10, 384) 0 block_9_expand_BN[0][0]
__________________________________________________________________________________________________
block_9_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_9_expand_relu[0][0]
__________________________________________________________________________________________________
block_9_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_9_depthwise[0][0]
__________________________________________________________________________________________________
block_9_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_9_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_9_project (Conv2D) (None, 10, 10, 64) 24576 block_9_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_9_project_BN (BatchNormal (None, 10, 10, 64) 256 block_9_project[0][0]
__________________________________________________________________________________________________
block_9_add (Add) (None, 10, 10, 64) 0 block_8_add[0][0]
block_9_project_BN[0][0]
__________________________________________________________________________________________________
block_10_expand (Conv2D) (None, 10, 10, 384) 24576 block_9_add[0][0]
__________________________________________________________________________________________________
block_10_expand_BN (BatchNormal (None, 10, 10, 384) 1536 block_10_expand[0][0]
__________________________________________________________________________________________________
block_10_expand_relu (ReLU) (None, 10, 10, 384) 0 block_10_expand_BN[0][0]
__________________________________________________________________________________________________
block_10_depthwise (DepthwiseCo (None, 10, 10, 384) 3456 block_10_expand_relu[0][0]
__________________________________________________________________________________________________
block_10_depthwise_BN (BatchNor (None, 10, 10, 384) 1536 block_10_depthwise[0][0]
__________________________________________________________________________________________________
block_10_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_10_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_10_project (Conv2D) (None, 10, 10, 96) 36864 block_10_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_10_project_BN (BatchNorma (None, 10, 10, 96) 384 block_10_project[0][0]
__________________________________________________________________________________________________
block_11_expand (Conv2D) (None, 10, 10, 576) 55296 block_10_project_BN[0][0]
__________________________________________________________________________________________________
block_11_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_11_expand[0][0]
__________________________________________________________________________________________________
block_11_expand_relu (ReLU) (None, 10, 10, 576) 0 block_11_expand_BN[0][0]
__________________________________________________________________________________________________
block_11_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_11_expand_relu[0][0]
__________________________________________________________________________________________________
block_11_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_11_depthwise[0][0]
__________________________________________________________________________________________________
block_11_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_11_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_11_project (Conv2D) (None, 10, 10, 96) 55296 block_11_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_11_project_BN (BatchNorma (None, 10, 10, 96) 384 block_11_project[0][0]
__________________________________________________________________________________________________
block_11_add (Add) (None, 10, 10, 96) 0 block_10_project_BN[0][0]
block_11_project_BN[0][0]
__________________________________________________________________________________________________
block_12_expand (Conv2D) (None, 10, 10, 576) 55296 block_11_add[0][0]
__________________________________________________________________________________________________
block_12_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_12_expand[0][0]
__________________________________________________________________________________________________
block_12_expand_relu (ReLU) (None, 10, 10, 576) 0 block_12_expand_BN[0][0]
__________________________________________________________________________________________________
block_12_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_12_expand_relu[0][0]
__________________________________________________________________________________________________
block_12_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_12_depthwise[0][0]
__________________________________________________________________________________________________
block_12_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_12_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_12_project (Conv2D) (None, 10, 10, 96) 55296 block_12_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_12_project_BN (BatchNorma (None, 10, 10, 96) 384 block_12_project[0][0]
__________________________________________________________________________________________________
block_12_add (Add) (None, 10, 10, 96) 0 block_11_add[0][0]
block_12_project_BN[0][0]
__________________________________________________________________________________________________
block_13_expand (Conv2D) (None, 10, 10, 576) 55296 block_12_add[0][0]
__________________________________________________________________________________________________
block_13_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_13_expand[0][0]
__________________________________________________________________________________________________
block_13_expand_relu (ReLU) (None, 10, 10, 576) 0 block_13_expand_BN[0][0]
__________________________________________________________________________________________________
block_13_pad (ZeroPadding2D) (None, 11, 11, 576) 0 block_13_expand_relu[0][0]
__________________________________________________________________________________________________
block_13_depthwise (DepthwiseCo (None, 5, 5, 576) 5184 block_13_pad[0][0]
__________________________________________________________________________________________________
block_13_depthwise_BN (BatchNor (None, 5, 5, 576) 2304 block_13_depthwise[0][0]
__________________________________________________________________________________________________
block_13_depthwise_relu (ReLU) (None, 5, 5, 576) 0 block_13_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_13_project (Conv2D) (None, 5, 5, 160) 92160 block_13_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_13_project_BN (BatchNorma (None, 5, 5, 160) 640 block_13_project[0][0]
__________________________________________________________________________________________________
block_14_expand (Conv2D) (None, 5, 5, 960) 153600 block_13_project_BN[0][0]
__________________________________________________________________________________________________
block_14_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_14_expand[0][0]
__________________________________________________________________________________________________
block_14_expand_relu (ReLU) (None, 5, 5, 960) 0 block_14_expand_BN[0][0]
__________________________________________________________________________________________________
block_14_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_14_expand_relu[0][0]
__________________________________________________________________________________________________
block_14_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_14_depthwise[0][0]
__________________________________________________________________________________________________
block_14_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_14_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_14_project (Conv2D) (None, 5, 5, 160) 153600 block_14_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_14_project_BN (BatchNorma (None, 5, 5, 160) 640 block_14_project[0][0]
__________________________________________________________________________________________________
block_14_add (Add) (None, 5, 5, 160) 0 block_13_project_BN[0][0]
block_14_project_BN[0][0]
__________________________________________________________________________________________________
block_15_expand (Conv2D) (None, 5, 5, 960) 153600 block_14_add[0][0]
__________________________________________________________________________________________________
block_15_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_15_expand[0][0]
__________________________________________________________________________________________________
block_15_expand_relu (ReLU) (None, 5, 5, 960) 0 block_15_expand_BN[0][0]
__________________________________________________________________________________________________
block_15_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_15_expand_relu[0][0]
__________________________________________________________________________________________________
block_15_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_15_depthwise[0][0]
__________________________________________________________________________________________________
block_15_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_15_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_15_project (Conv2D) (None, 5, 5, 160) 153600 block_15_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_15_project_BN (BatchNorma (None, 5, 5, 160) 640 block_15_project[0][0]
__________________________________________________________________________________________________
block_15_add (Add) (None, 5, 5, 160) 0 block_14_add[0][0]
block_15_project_BN[0][0]
__________________________________________________________________________________________________
block_16_expand (Conv2D) (None, 5, 5, 960) 153600 block_15_add[0][0]
__________________________________________________________________________________________________
block_16_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_16_expand[0][0]
__________________________________________________________________________________________________
block_16_expand_relu (ReLU) (None, 5, 5, 960) 0 block_16_expand_BN[0][0]
__________________________________________________________________________________________________
block_16_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_16_expand_relu[0][0]
__________________________________________________________________________________________________
block_16_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_16_depthwise[0][0]
__________________________________________________________________________________________________
block_16_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_16_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_16_project (Conv2D) (None, 5, 5, 320) 307200 block_16_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_16_project_BN (BatchNorma (None, 5, 5, 320) 1280 block_16_project[0][0]
__________________________________________________________________________________________________
Conv_1 (Conv2D) (None, 5, 5, 1280) 409600 block_16_project_BN[0][0]
__________________________________________________________________________________________________
Conv_1_bn (BatchNormalization) (None, 5, 5, 1280) 5120 Conv_1[0][0]
__________________________________________________________________________________________________
out_relu (ReLU) (None, 5, 5, 1280) 0 Conv_1_bn[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_6 (Glo (None, 1280) 0 out_relu[0][0]
==================================================================================================
Total params: 2,257,984
Trainable params: 2,223,872
Non-trainable params: 34,112
__________________________________________________________________________________________________
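As a sanity check on the summary above, the parameter counts of one of MobileNetV2's inverted-residual blocks can be reproduced by hand. The sketch below (pure Python, independent of TensorFlow) recomputes the `block_16` counts; the helper function names are just for illustration:

```python
# Parameter counts for block_16, matching the summary above.
# MobileNetV2's convolutions are bias-free; BatchNorm carries 4 values per channel.

def conv1x1_params(c_in, c_out):
    # 1x1 pointwise convolution, no bias
    return c_in * c_out

def depthwise3x3_params(channels):
    # one 3x3 kernel per input channel, no bias
    return channels * 3 * 3

def bn_params(channels):
    # gamma, beta, moving mean, moving variance
    return 4 * channels

print(conv1x1_params(160, 960))    # block_16_expand:    153600
print(depthwise3x3_params(960))    # block_16_depthwise: 8640
print(conv1x1_params(960, 320))    # block_16_project:   307200
print(bn_params(960))              # block_16_expand_BN: 3840
```

The same arithmetic applies to every block in the summary, scaled by its channel counts.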
You can now construct new final classifier layers for your model. Using the functional API, create a new model according to the following specifications:
A Dense layer with 32 units and ReLU activation, taking the feature extractor's output as its input.
A Dropout layer with rate 0.5.
A final Dense layer with 1 unit and sigmoid activation, producing the binary classification output.
In total, the network should be composed of the pretrained base model plus 3 layers.
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def add_new_classifier_head(feature_extractor_model):
    """
    This function takes the feature extractor model as an argument, and should create
    and return a new model according to the above specification.
    """
    from tensorflow.keras.layers import Dense, Dropout  # layer classes used below
    inputs = feature_extractor_model.inputs
    h = Dense(units=32, activation='relu')(feature_extractor_model.output)
    h = Dropout(rate=0.5)(h)
    outputs = Dense(units=1, activation='sigmoid')(h)
    return Model(inputs=inputs, outputs=outputs)
# Call the function adding a new classification head and display the summary
new_model = add_new_classifier_head(feature_extractor)
new_model.summary()
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_6 (InputLayer) [(None, 160, 160, 3) 0
__________________________________________________________________________________________________
Conv1_pad (ZeroPadding2D) (None, 161, 161, 3) 0 input_6[0][0]
__________________________________________________________________________________________________
Conv1 (Conv2D) (None, 80, 80, 32) 864 Conv1_pad[0][0]
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalization) (None, 80, 80, 32) 128 Conv1[0][0]
__________________________________________________________________________________________________
Conv1_relu (ReLU) (None, 80, 80, 32) 0 bn_Conv1[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise (Depthw (None, 80, 80, 32) 288 Conv1_relu[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_BN (Bat (None, 80, 80, 32) 128 expanded_conv_depthwise[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_relu (R (None, 80, 80, 32) 0 expanded_conv_depthwise_BN[0][0]
__________________________________________________________________________________________________
expanded_conv_project (Conv2D) (None, 80, 80, 16) 512 expanded_conv_depthwise_relu[0][0
__________________________________________________________________________________________________
expanded_conv_project_BN (Batch (None, 80, 80, 16) 64 expanded_conv_project[0][0]
__________________________________________________________________________________________________
block_1_expand (Conv2D) (None, 80, 80, 96) 1536 expanded_conv_project_BN[0][0]
__________________________________________________________________________________________________
block_1_expand_BN (BatchNormali (None, 80, 80, 96) 384 block_1_expand[0][0]
__________________________________________________________________________________________________
block_1_expand_relu (ReLU) (None, 80, 80, 96) 0 block_1_expand_BN[0][0]
__________________________________________________________________________________________________
block_1_pad (ZeroPadding2D) (None, 81, 81, 96) 0 block_1_expand_relu[0][0]
__________________________________________________________________________________________________
block_1_depthwise (DepthwiseCon (None, 40, 40, 96) 864 block_1_pad[0][0]
__________________________________________________________________________________________________
block_1_depthwise_BN (BatchNorm (None, 40, 40, 96) 384 block_1_depthwise[0][0]
__________________________________________________________________________________________________
block_1_depthwise_relu (ReLU) (None, 40, 40, 96) 0 block_1_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_1_project (Conv2D) (None, 40, 40, 24) 2304 block_1_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_1_project_BN (BatchNormal (None, 40, 40, 24) 96 block_1_project[0][0]
__________________________________________________________________________________________________
block_2_expand (Conv2D) (None, 40, 40, 144) 3456 block_1_project_BN[0][0]
__________________________________________________________________________________________________
block_2_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_2_expand[0][0]
__________________________________________________________________________________________________
block_2_expand_relu (ReLU) (None, 40, 40, 144) 0 block_2_expand_BN[0][0]
__________________________________________________________________________________________________
block_2_depthwise (DepthwiseCon (None, 40, 40, 144) 1296 block_2_expand_relu[0][0]
__________________________________________________________________________________________________
block_2_depthwise_BN (BatchNorm (None, 40, 40, 144) 576 block_2_depthwise[0][0]
__________________________________________________________________________________________________
block_2_depthwise_relu (ReLU) (None, 40, 40, 144) 0 block_2_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_2_project (Conv2D) (None, 40, 40, 24) 3456 block_2_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_2_project_BN (BatchNormal (None, 40, 40, 24) 96 block_2_project[0][0]
__________________________________________________________________________________________________
block_2_add (Add) (None, 40, 40, 24) 0 block_1_project_BN[0][0]
block_2_project_BN[0][0]
__________________________________________________________________________________________________
block_3_expand (Conv2D) (None, 40, 40, 144) 3456 block_2_add[0][0]
__________________________________________________________________________________________________
block_3_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_3_expand[0][0]
__________________________________________________________________________________________________
block_3_expand_relu (ReLU) (None, 40, 40, 144) 0 block_3_expand_BN[0][0]
__________________________________________________________________________________________________
block_3_pad (ZeroPadding2D) (None, 41, 41, 144) 0 block_3_expand_relu[0][0]
__________________________________________________________________________________________________
block_3_depthwise (DepthwiseCon (None, 20, 20, 144) 1296 block_3_pad[0][0]
__________________________________________________________________________________________________
block_3_depthwise_BN (BatchNorm (None, 20, 20, 144) 576 block_3_depthwise[0][0]
__________________________________________________________________________________________________
block_3_depthwise_relu (ReLU) (None, 20, 20, 144) 0 block_3_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_3_project (Conv2D) (None, 20, 20, 32) 4608 block_3_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_3_project_BN (BatchNormal (None, 20, 20, 32) 128 block_3_project[0][0]
__________________________________________________________________________________________________
block_4_expand (Conv2D) (None, 20, 20, 192) 6144 block_3_project_BN[0][0]
__________________________________________________________________________________________________
block_4_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_4_expand[0][0]
__________________________________________________________________________________________________
block_4_expand_relu (ReLU) (None, 20, 20, 192) 0 block_4_expand_BN[0][0]
__________________________________________________________________________________________________
block_4_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_4_expand_relu[0][0]
__________________________________________________________________________________________________
block_4_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_4_depthwise[0][0]
__________________________________________________________________________________________________
block_4_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_4_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_4_project (Conv2D) (None, 20, 20, 32) 6144 block_4_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_4_project_BN (BatchNormal (None, 20, 20, 32) 128 block_4_project[0][0]
__________________________________________________________________________________________________
block_4_add (Add) (None, 20, 20, 32) 0 block_3_project_BN[0][0]
block_4_project_BN[0][0]
__________________________________________________________________________________________________
block_5_expand (Conv2D) (None, 20, 20, 192) 6144 block_4_add[0][0]
__________________________________________________________________________________________________
block_5_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_5_expand[0][0]
__________________________________________________________________________________________________
block_5_expand_relu (ReLU) (None, 20, 20, 192) 0 block_5_expand_BN[0][0]
__________________________________________________________________________________________________
block_5_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_5_expand_relu[0][0]
__________________________________________________________________________________________________
block_5_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_5_depthwise[0][0]
__________________________________________________________________________________________________
block_5_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_5_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_5_project (Conv2D) (None, 20, 20, 32) 6144 block_5_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_5_project_BN (BatchNormal (None, 20, 20, 32) 128 block_5_project[0][0]
__________________________________________________________________________________________________
block_5_add (Add) (None, 20, 20, 32) 0 block_4_add[0][0]
block_5_project_BN[0][0]
__________________________________________________________________________________________________
block_6_expand (Conv2D) (None, 20, 20, 192) 6144 block_5_add[0][0]
__________________________________________________________________________________________________
block_6_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_6_expand[0][0]
__________________________________________________________________________________________________
block_6_expand_relu (ReLU) (None, 20, 20, 192) 0 block_6_expand_BN[0][0]
__________________________________________________________________________________________________
block_6_pad (ZeroPadding2D) (None, 21, 21, 192) 0 block_6_expand_relu[0][0]
__________________________________________________________________________________________________
block_6_depthwise (DepthwiseCon (None, 10, 10, 192) 1728 block_6_pad[0][0]
__________________________________________________________________________________________________
block_6_depthwise_BN (BatchNorm (None, 10, 10, 192) 768 block_6_depthwise[0][0]
__________________________________________________________________________________________________
block_6_depthwise_relu (ReLU) (None, 10, 10, 192) 0 block_6_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_6_project (Conv2D) (None, 10, 10, 64) 12288 block_6_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_6_project_BN (BatchNormal (None, 10, 10, 64) 256 block_6_project[0][0]
__________________________________________________________________________________________________
block_7_expand (Conv2D) (None, 10, 10, 384) 24576 block_6_project_BN[0][0]
__________________________________________________________________________________________________
block_7_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_7_expand[0][0]
__________________________________________________________________________________________________
block_7_expand_relu (ReLU) (None, 10, 10, 384) 0 block_7_expand_BN[0][0]
__________________________________________________________________________________________________
block_7_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_7_expand_relu[0][0]
__________________________________________________________________________________________________
block_7_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_7_depthwise[0][0]
__________________________________________________________________________________________________
block_7_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_7_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_7_project (Conv2D) (None, 10, 10, 64) 24576 block_7_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_7_project_BN (BatchNormal (None, 10, 10, 64) 256 block_7_project[0][0]
__________________________________________________________________________________________________
block_7_add (Add) (None, 10, 10, 64) 0 block_6_project_BN[0][0]
block_7_project_BN[0][0]
__________________________________________________________________________________________________
block_8_expand (Conv2D) (None, 10, 10, 384) 24576 block_7_add[0][0]
__________________________________________________________________________________________________
block_8_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_8_expand[0][0]
__________________________________________________________________________________________________
block_8_expand_relu (ReLU) (None, 10, 10, 384) 0 block_8_expand_BN[0][0]
__________________________________________________________________________________________________
block_8_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_8_expand_relu[0][0]
__________________________________________________________________________________________________
block_8_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_8_depthwise[0][0]
__________________________________________________________________________________________________
block_8_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_8_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_8_project (Conv2D) (None, 10, 10, 64) 24576 block_8_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_8_project_BN (BatchNormal (None, 10, 10, 64) 256 block_8_project[0][0]
__________________________________________________________________________________________________
block_8_add (Add) (None, 10, 10, 64) 0 block_7_add[0][0]
block_8_project_BN[0][0]
__________________________________________________________________________________________________
block_9_expand (Conv2D) (None, 10, 10, 384) 24576 block_8_add[0][0]
__________________________________________________________________________________________________
block_9_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_9_expand[0][0]
__________________________________________________________________________________________________
block_9_expand_relu (ReLU) (None, 10, 10, 384) 0 block_9_expand_BN[0][0]
__________________________________________________________________________________________________
block_9_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_9_expand_relu[0][0]
__________________________________________________________________________________________________
block_9_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_9_depthwise[0][0]
__________________________________________________________________________________________________
block_9_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_9_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_9_project (Conv2D) (None, 10, 10, 64) 24576 block_9_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_9_project_BN (BatchNormal (None, 10, 10, 64) 256 block_9_project[0][0]
__________________________________________________________________________________________________
block_9_add (Add) (None, 10, 10, 64) 0 block_8_add[0][0]
block_9_project_BN[0][0]
__________________________________________________________________________________________________
block_10_expand (Conv2D) (None, 10, 10, 384) 24576 block_9_add[0][0]
__________________________________________________________________________________________________
block_10_expand_BN (BatchNormal (None, 10, 10, 384) 1536 block_10_expand[0][0]
__________________________________________________________________________________________________
block_10_expand_relu (ReLU) (None, 10, 10, 384) 0 block_10_expand_BN[0][0]
__________________________________________________________________________________________________
block_10_depthwise (DepthwiseCo (None, 10, 10, 384) 3456 block_10_expand_relu[0][0]
__________________________________________________________________________________________________
block_10_depthwise_BN (BatchNor (None, 10, 10, 384) 1536 block_10_depthwise[0][0]
__________________________________________________________________________________________________
block_10_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_10_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_10_project (Conv2D) (None, 10, 10, 96) 36864 block_10_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_10_project_BN (BatchNorma (None, 10, 10, 96) 384 block_10_project[0][0]
__________________________________________________________________________________________________
block_11_expand (Conv2D) (None, 10, 10, 576) 55296 block_10_project_BN[0][0]
__________________________________________________________________________________________________
block_11_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_11_expand[0][0]
__________________________________________________________________________________________________
block_11_expand_relu (ReLU) (None, 10, 10, 576) 0 block_11_expand_BN[0][0]
__________________________________________________________________________________________________
block_11_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_11_expand_relu[0][0]
__________________________________________________________________________________________________
block_11_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_11_depthwise[0][0]
__________________________________________________________________________________________________
block_11_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_11_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_11_project (Conv2D) (None, 10, 10, 96) 55296 block_11_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_11_project_BN (BatchNorma (None, 10, 10, 96) 384 block_11_project[0][0]
__________________________________________________________________________________________________
block_11_add (Add) (None, 10, 10, 96) 0 block_10_project_BN[0][0]
block_11_project_BN[0][0]
__________________________________________________________________________________________________
block_12_expand (Conv2D) (None, 10, 10, 576) 55296 block_11_add[0][0]
__________________________________________________________________________________________________
block_12_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_12_expand[0][0]
__________________________________________________________________________________________________
block_12_expand_relu (ReLU) (None, 10, 10, 576) 0 block_12_expand_BN[0][0]
__________________________________________________________________________________________________
block_12_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_12_expand_relu[0][0]
__________________________________________________________________________________________________
block_12_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_12_depthwise[0][0]
__________________________________________________________________________________________________
block_12_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_12_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_12_project (Conv2D) (None, 10, 10, 96) 55296 block_12_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_12_project_BN (BatchNorma (None, 10, 10, 96) 384 block_12_project[0][0]
__________________________________________________________________________________________________
block_12_add (Add) (None, 10, 10, 96) 0 block_11_add[0][0]
block_12_project_BN[0][0]
__________________________________________________________________________________________________
block_13_expand (Conv2D) (None, 10, 10, 576) 55296 block_12_add[0][0]
__________________________________________________________________________________________________
block_13_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_13_expand[0][0]
__________________________________________________________________________________________________
block_13_expand_relu (ReLU) (None, 10, 10, 576) 0 block_13_expand_BN[0][0]
__________________________________________________________________________________________________
block_13_pad (ZeroPadding2D) (None, 11, 11, 576) 0 block_13_expand_relu[0][0]
__________________________________________________________________________________________________
block_13_depthwise (DepthwiseCo (None, 5, 5, 576) 5184 block_13_pad[0][0]
__________________________________________________________________________________________________
block_13_depthwise_BN (BatchNor (None, 5, 5, 576) 2304 block_13_depthwise[0][0]
__________________________________________________________________________________________________
block_13_depthwise_relu (ReLU) (None, 5, 5, 576) 0 block_13_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_13_project (Conv2D) (None, 5, 5, 160) 92160 block_13_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_13_project_BN (BatchNorma (None, 5, 5, 160) 640 block_13_project[0][0]
__________________________________________________________________________________________________
block_14_expand (Conv2D) (None, 5, 5, 960) 153600 block_13_project_BN[0][0]
__________________________________________________________________________________________________
block_14_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_14_expand[0][0]
__________________________________________________________________________________________________
block_14_expand_relu (ReLU) (None, 5, 5, 960) 0 block_14_expand_BN[0][0]
__________________________________________________________________________________________________
block_14_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_14_expand_relu[0][0]
__________________________________________________________________________________________________
block_14_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_14_depthwise[0][0]
__________________________________________________________________________________________________
block_14_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_14_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_14_project (Conv2D) (None, 5, 5, 160) 153600 block_14_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_14_project_BN (BatchNorma (None, 5, 5, 160) 640 block_14_project[0][0]
__________________________________________________________________________________________________
block_14_add (Add) (None, 5, 5, 160) 0 block_13_project_BN[0][0]
block_14_project_BN[0][0]
__________________________________________________________________________________________________
block_15_expand (Conv2D) (None, 5, 5, 960) 153600 block_14_add[0][0]
__________________________________________________________________________________________________
block_15_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_15_expand[0][0]
__________________________________________________________________________________________________
block_15_expand_relu (ReLU) (None, 5, 5, 960) 0 block_15_expand_BN[0][0]
__________________________________________________________________________________________________
block_15_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_15_expand_relu[0][0]
__________________________________________________________________________________________________
block_15_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_15_depthwise[0][0]
__________________________________________________________________________________________________
block_15_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_15_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_15_project (Conv2D) (None, 5, 5, 160) 153600 block_15_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_15_project_BN (BatchNorma (None, 5, 5, 160) 640 block_15_project[0][0]
__________________________________________________________________________________________________
block_15_add (Add) (None, 5, 5, 160) 0 block_14_add[0][0]
block_15_project_BN[0][0]
__________________________________________________________________________________________________
block_16_expand (Conv2D) (None, 5, 5, 960) 153600 block_15_add[0][0]
__________________________________________________________________________________________________
block_16_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_16_expand[0][0]
__________________________________________________________________________________________________
block_16_expand_relu (ReLU) (None, 5, 5, 960) 0 block_16_expand_BN[0][0]
__________________________________________________________________________________________________
block_16_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_16_expand_relu[0][0]
__________________________________________________________________________________________________
block_16_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_16_depthwise[0][0]
__________________________________________________________________________________________________
block_16_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_16_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_16_project (Conv2D) (None, 5, 5, 320) 307200 block_16_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_16_project_BN (BatchNorma (None, 5, 5, 320) 1280 block_16_project[0][0]
__________________________________________________________________________________________________
Conv_1 (Conv2D) (None, 5, 5, 1280) 409600 block_16_project_BN[0][0]
__________________________________________________________________________________________________
Conv_1_bn (BatchNormalization) (None, 5, 5, 1280) 5120 Conv_1[0][0]
__________________________________________________________________________________________________
out_relu (ReLU) (None, 5, 5, 1280) 0 Conv_1_bn[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_6 (Glo (None, 1280) 0 out_relu[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 32) 40992 global_average_pooling2d_6[0][0]
__________________________________________________________________________________________________
dropout (Dropout) (None, 32) 0 dense_2[0][0]
__________________________________________________________________________________________________
dense_3 (Dense) (None, 1) 33 dropout[0][0]
==================================================================================================
Total params: 2,299,009
Trainable params: 2,264,897
Non-trainable params: 34,112
__________________________________________________________________________________________________
You will now need to freeze the weights of the pre-trained feature extractor, so that only the weights of the new layers you have added change during training.
You should then compile your model as before: use the RMSprop optimiser with learning rate 0.001, binary cross entropy loss, and binary accuracy as the metric.
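As a reminder of the general pattern (a sketch only, not the graded solution below): freezing is done by setting each pretrained layer's `trainable` attribute to `False` before compiling. The helper name `freeze_head_only` and the `num_new_layers` argument here are illustrative, not part of the assignment.

```python
import tensorflow as tf

def freeze_head_only(model, num_new_layers):
    """Illustrative helper: freeze all layers except the last num_new_layers,
    then compile for binary classification as specified above."""
    for layer in model.layers[:-num_new_layers]:
        layer.trainable = False
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model
```

Note that freezing must happen before `compile` (or be followed by a re-compile) for the change to take effect during training.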
#### GRADED CELL ####

# Complete the following function.
# Make sure to not change the function name or arguments.

def freeze_pretrained_weights(model):
    """
    This function should freeze the weights of the pretrained base model.
    Your function should return the model with frozen weights.
    """
    # Freeze everything except the new classification head (last three layers)
    for layer in model.layers[:-3]:
        layer.trainable = False
    model.compile(
        optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model
# Call the function freezing the pretrained weights and display the summary
frozen_new_model = freeze_pretrained_weights(new_model)
frozen_new_model.summary()
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_6 (InputLayer) [(None, 160, 160, 3) 0
__________________________________________________________________________________________________
Conv1_pad (ZeroPadding2D) (None, 161, 161, 3) 0 input_6[0][0]
__________________________________________________________________________________________________
Conv1 (Conv2D) (None, 80, 80, 32) 864 Conv1_pad[0][0]
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalization) (None, 80, 80, 32) 128 Conv1[0][0]
__________________________________________________________________________________________________
Conv1_relu (ReLU) (None, 80, 80, 32) 0 bn_Conv1[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise (Depthw (None, 80, 80, 32) 288 Conv1_relu[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_BN (Bat (None, 80, 80, 32) 128 expanded_conv_depthwise[0][0]
__________________________________________________________________________________________________
expanded_conv_depthwise_relu (R (None, 80, 80, 32) 0 expanded_conv_depthwise_BN[0][0]
__________________________________________________________________________________________________
expanded_conv_project (Conv2D) (None, 80, 80, 16) 512 expanded_conv_depthwise_relu[0][0
__________________________________________________________________________________________________
expanded_conv_project_BN (Batch (None, 80, 80, 16) 64 expanded_conv_project[0][0]
__________________________________________________________________________________________________
block_1_expand (Conv2D) (None, 80, 80, 96) 1536 expanded_conv_project_BN[0][0]
__________________________________________________________________________________________________
block_1_expand_BN (BatchNormali (None, 80, 80, 96) 384 block_1_expand[0][0]
__________________________________________________________________________________________________
block_1_expand_relu (ReLU) (None, 80, 80, 96) 0 block_1_expand_BN[0][0]
__________________________________________________________________________________________________
block_1_pad (ZeroPadding2D) (None, 81, 81, 96) 0 block_1_expand_relu[0][0]
__________________________________________________________________________________________________
block_1_depthwise (DepthwiseCon (None, 40, 40, 96) 864 block_1_pad[0][0]
__________________________________________________________________________________________________
block_1_depthwise_BN (BatchNorm (None, 40, 40, 96) 384 block_1_depthwise[0][0]
__________________________________________________________________________________________________
block_1_depthwise_relu (ReLU) (None, 40, 40, 96) 0 block_1_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_1_project (Conv2D) (None, 40, 40, 24) 2304 block_1_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_1_project_BN (BatchNormal (None, 40, 40, 24) 96 block_1_project[0][0]
__________________________________________________________________________________________________
block_2_expand (Conv2D) (None, 40, 40, 144) 3456 block_1_project_BN[0][0]
__________________________________________________________________________________________________
block_2_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_2_expand[0][0]
__________________________________________________________________________________________________
block_2_expand_relu (ReLU) (None, 40, 40, 144) 0 block_2_expand_BN[0][0]
__________________________________________________________________________________________________
block_2_depthwise (DepthwiseCon (None, 40, 40, 144) 1296 block_2_expand_relu[0][0]
__________________________________________________________________________________________________
block_2_depthwise_BN (BatchNorm (None, 40, 40, 144) 576 block_2_depthwise[0][0]
__________________________________________________________________________________________________
block_2_depthwise_relu (ReLU) (None, 40, 40, 144) 0 block_2_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_2_project (Conv2D) (None, 40, 40, 24) 3456 block_2_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_2_project_BN (BatchNormal (None, 40, 40, 24) 96 block_2_project[0][0]
__________________________________________________________________________________________________
block_2_add (Add) (None, 40, 40, 24) 0 block_1_project_BN[0][0]
block_2_project_BN[0][0]
__________________________________________________________________________________________________
block_3_expand (Conv2D) (None, 40, 40, 144) 3456 block_2_add[0][0]
__________________________________________________________________________________________________
block_3_expand_BN (BatchNormali (None, 40, 40, 144) 576 block_3_expand[0][0]
__________________________________________________________________________________________________
block_3_expand_relu (ReLU) (None, 40, 40, 144) 0 block_3_expand_BN[0][0]
__________________________________________________________________________________________________
block_3_pad (ZeroPadding2D) (None, 41, 41, 144) 0 block_3_expand_relu[0][0]
__________________________________________________________________________________________________
block_3_depthwise (DepthwiseCon (None, 20, 20, 144) 1296 block_3_pad[0][0]
__________________________________________________________________________________________________
block_3_depthwise_BN (BatchNorm (None, 20, 20, 144) 576 block_3_depthwise[0][0]
__________________________________________________________________________________________________
block_3_depthwise_relu (ReLU) (None, 20, 20, 144) 0 block_3_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_3_project (Conv2D) (None, 20, 20, 32) 4608 block_3_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_3_project_BN (BatchNormal (None, 20, 20, 32) 128 block_3_project[0][0]
__________________________________________________________________________________________________
block_4_expand (Conv2D) (None, 20, 20, 192) 6144 block_3_project_BN[0][0]
__________________________________________________________________________________________________
block_4_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_4_expand[0][0]
__________________________________________________________________________________________________
block_4_expand_relu (ReLU) (None, 20, 20, 192) 0 block_4_expand_BN[0][0]
__________________________________________________________________________________________________
block_4_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_4_expand_relu[0][0]
__________________________________________________________________________________________________
block_4_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_4_depthwise[0][0]
__________________________________________________________________________________________________
block_4_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_4_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_4_project (Conv2D) (None, 20, 20, 32) 6144 block_4_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_4_project_BN (BatchNormal (None, 20, 20, 32) 128 block_4_project[0][0]
__________________________________________________________________________________________________
block_4_add (Add) (None, 20, 20, 32) 0 block_3_project_BN[0][0]
block_4_project_BN[0][0]
__________________________________________________________________________________________________
block_5_expand (Conv2D) (None, 20, 20, 192) 6144 block_4_add[0][0]
__________________________________________________________________________________________________
block_5_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_5_expand[0][0]
__________________________________________________________________________________________________
block_5_expand_relu (ReLU) (None, 20, 20, 192) 0 block_5_expand_BN[0][0]
__________________________________________________________________________________________________
block_5_depthwise (DepthwiseCon (None, 20, 20, 192) 1728 block_5_expand_relu[0][0]
__________________________________________________________________________________________________
block_5_depthwise_BN (BatchNorm (None, 20, 20, 192) 768 block_5_depthwise[0][0]
__________________________________________________________________________________________________
block_5_depthwise_relu (ReLU) (None, 20, 20, 192) 0 block_5_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_5_project (Conv2D) (None, 20, 20, 32) 6144 block_5_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_5_project_BN (BatchNormal (None, 20, 20, 32) 128 block_5_project[0][0]
__________________________________________________________________________________________________
block_5_add (Add) (None, 20, 20, 32) 0 block_4_add[0][0]
block_5_project_BN[0][0]
__________________________________________________________________________________________________
block_6_expand (Conv2D) (None, 20, 20, 192) 6144 block_5_add[0][0]
__________________________________________________________________________________________________
block_6_expand_BN (BatchNormali (None, 20, 20, 192) 768 block_6_expand[0][0]
__________________________________________________________________________________________________
block_6_expand_relu (ReLU) (None, 20, 20, 192) 0 block_6_expand_BN[0][0]
__________________________________________________________________________________________________
block_6_pad (ZeroPadding2D) (None, 21, 21, 192) 0 block_6_expand_relu[0][0]
__________________________________________________________________________________________________
block_6_depthwise (DepthwiseCon (None, 10, 10, 192) 1728 block_6_pad[0][0]
__________________________________________________________________________________________________
block_6_depthwise_BN (BatchNorm (None, 10, 10, 192) 768 block_6_depthwise[0][0]
__________________________________________________________________________________________________
block_6_depthwise_relu (ReLU) (None, 10, 10, 192) 0 block_6_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_6_project (Conv2D) (None, 10, 10, 64) 12288 block_6_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_6_project_BN (BatchNormal (None, 10, 10, 64) 256 block_6_project[0][0]
__________________________________________________________________________________________________
block_7_expand (Conv2D) (None, 10, 10, 384) 24576 block_6_project_BN[0][0]
__________________________________________________________________________________________________
block_7_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_7_expand[0][0]
__________________________________________________________________________________________________
block_7_expand_relu (ReLU) (None, 10, 10, 384) 0 block_7_expand_BN[0][0]
__________________________________________________________________________________________________
block_7_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_7_expand_relu[0][0]
__________________________________________________________________________________________________
block_7_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_7_depthwise[0][0]
__________________________________________________________________________________________________
block_7_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_7_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_7_project (Conv2D) (None, 10, 10, 64) 24576 block_7_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_7_project_BN (BatchNormal (None, 10, 10, 64) 256 block_7_project[0][0]
__________________________________________________________________________________________________
block_7_add (Add) (None, 10, 10, 64) 0 block_6_project_BN[0][0]
block_7_project_BN[0][0]
__________________________________________________________________________________________________
block_8_expand (Conv2D) (None, 10, 10, 384) 24576 block_7_add[0][0]
__________________________________________________________________________________________________
block_8_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_8_expand[0][0]
__________________________________________________________________________________________________
block_8_expand_relu (ReLU) (None, 10, 10, 384) 0 block_8_expand_BN[0][0]
__________________________________________________________________________________________________
block_8_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_8_expand_relu[0][0]
__________________________________________________________________________________________________
block_8_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_8_depthwise[0][0]
__________________________________________________________________________________________________
block_8_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_8_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_8_project (Conv2D) (None, 10, 10, 64) 24576 block_8_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_8_project_BN (BatchNormal (None, 10, 10, 64) 256 block_8_project[0][0]
__________________________________________________________________________________________________
block_8_add (Add) (None, 10, 10, 64) 0 block_7_add[0][0]
block_8_project_BN[0][0]
__________________________________________________________________________________________________
block_9_expand (Conv2D) (None, 10, 10, 384) 24576 block_8_add[0][0]
__________________________________________________________________________________________________
block_9_expand_BN (BatchNormali (None, 10, 10, 384) 1536 block_9_expand[0][0]
__________________________________________________________________________________________________
block_9_expand_relu (ReLU) (None, 10, 10, 384) 0 block_9_expand_BN[0][0]
__________________________________________________________________________________________________
block_9_depthwise (DepthwiseCon (None, 10, 10, 384) 3456 block_9_expand_relu[0][0]
__________________________________________________________________________________________________
block_9_depthwise_BN (BatchNorm (None, 10, 10, 384) 1536 block_9_depthwise[0][0]
__________________________________________________________________________________________________
block_9_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_9_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_9_project (Conv2D) (None, 10, 10, 64) 24576 block_9_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_9_project_BN (BatchNormal (None, 10, 10, 64) 256 block_9_project[0][0]
__________________________________________________________________________________________________
block_9_add (Add) (None, 10, 10, 64) 0 block_8_add[0][0]
block_9_project_BN[0][0]
__________________________________________________________________________________________________
block_10_expand (Conv2D) (None, 10, 10, 384) 24576 block_9_add[0][0]
__________________________________________________________________________________________________
block_10_expand_BN (BatchNormal (None, 10, 10, 384) 1536 block_10_expand[0][0]
__________________________________________________________________________________________________
block_10_expand_relu (ReLU) (None, 10, 10, 384) 0 block_10_expand_BN[0][0]
__________________________________________________________________________________________________
block_10_depthwise (DepthwiseCo (None, 10, 10, 384) 3456 block_10_expand_relu[0][0]
__________________________________________________________________________________________________
block_10_depthwise_BN (BatchNor (None, 10, 10, 384) 1536 block_10_depthwise[0][0]
__________________________________________________________________________________________________
block_10_depthwise_relu (ReLU) (None, 10, 10, 384) 0 block_10_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_10_project (Conv2D) (None, 10, 10, 96) 36864 block_10_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_10_project_BN (BatchNorma (None, 10, 10, 96) 384 block_10_project[0][0]
__________________________________________________________________________________________________
block_11_expand (Conv2D) (None, 10, 10, 576) 55296 block_10_project_BN[0][0]
__________________________________________________________________________________________________
block_11_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_11_expand[0][0]
__________________________________________________________________________________________________
block_11_expand_relu (ReLU) (None, 10, 10, 576) 0 block_11_expand_BN[0][0]
__________________________________________________________________________________________________
block_11_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_11_expand_relu[0][0]
__________________________________________________________________________________________________
block_11_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_11_depthwise[0][0]
__________________________________________________________________________________________________
block_11_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_11_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_11_project (Conv2D) (None, 10, 10, 96) 55296 block_11_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_11_project_BN (BatchNorma (None, 10, 10, 96) 384 block_11_project[0][0]
__________________________________________________________________________________________________
block_11_add (Add) (None, 10, 10, 96) 0 block_10_project_BN[0][0]
block_11_project_BN[0][0]
__________________________________________________________________________________________________
block_12_expand (Conv2D) (None, 10, 10, 576) 55296 block_11_add[0][0]
__________________________________________________________________________________________________
block_12_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_12_expand[0][0]
__________________________________________________________________________________________________
block_12_expand_relu (ReLU) (None, 10, 10, 576) 0 block_12_expand_BN[0][0]
__________________________________________________________________________________________________
block_12_depthwise (DepthwiseCo (None, 10, 10, 576) 5184 block_12_expand_relu[0][0]
__________________________________________________________________________________________________
block_12_depthwise_BN (BatchNor (None, 10, 10, 576) 2304 block_12_depthwise[0][0]
__________________________________________________________________________________________________
block_12_depthwise_relu (ReLU) (None, 10, 10, 576) 0 block_12_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_12_project (Conv2D) (None, 10, 10, 96) 55296 block_12_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_12_project_BN (BatchNorma (None, 10, 10, 96) 384 block_12_project[0][0]
__________________________________________________________________________________________________
block_12_add (Add) (None, 10, 10, 96) 0 block_11_add[0][0]
block_12_project_BN[0][0]
__________________________________________________________________________________________________
block_13_expand (Conv2D) (None, 10, 10, 576) 55296 block_12_add[0][0]
__________________________________________________________________________________________________
block_13_expand_BN (BatchNormal (None, 10, 10, 576) 2304 block_13_expand[0][0]
__________________________________________________________________________________________________
block_13_expand_relu (ReLU) (None, 10, 10, 576) 0 block_13_expand_BN[0][0]
__________________________________________________________________________________________________
block_13_pad (ZeroPadding2D) (None, 11, 11, 576) 0 block_13_expand_relu[0][0]
__________________________________________________________________________________________________
block_13_depthwise (DepthwiseCo (None, 5, 5, 576) 5184 block_13_pad[0][0]
__________________________________________________________________________________________________
block_13_depthwise_BN (BatchNor (None, 5, 5, 576) 2304 block_13_depthwise[0][0]
__________________________________________________________________________________________________
block_13_depthwise_relu (ReLU) (None, 5, 5, 576) 0 block_13_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_13_project (Conv2D) (None, 5, 5, 160) 92160 block_13_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_13_project_BN (BatchNorma (None, 5, 5, 160) 640 block_13_project[0][0]
__________________________________________________________________________________________________
block_14_expand (Conv2D) (None, 5, 5, 960) 153600 block_13_project_BN[0][0]
__________________________________________________________________________________________________
block_14_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_14_expand[0][0]
__________________________________________________________________________________________________
block_14_expand_relu (ReLU) (None, 5, 5, 960) 0 block_14_expand_BN[0][0]
__________________________________________________________________________________________________
block_14_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_14_expand_relu[0][0]
__________________________________________________________________________________________________
block_14_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_14_depthwise[0][0]
__________________________________________________________________________________________________
block_14_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_14_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_14_project (Conv2D) (None, 5, 5, 160) 153600 block_14_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_14_project_BN (BatchNorma (None, 5, 5, 160) 640 block_14_project[0][0]
__________________________________________________________________________________________________
block_14_add (Add) (None, 5, 5, 160) 0 block_13_project_BN[0][0]
block_14_project_BN[0][0]
__________________________________________________________________________________________________
block_15_expand (Conv2D) (None, 5, 5, 960) 153600 block_14_add[0][0]
__________________________________________________________________________________________________
block_15_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_15_expand[0][0]
__________________________________________________________________________________________________
block_15_expand_relu (ReLU) (None, 5, 5, 960) 0 block_15_expand_BN[0][0]
__________________________________________________________________________________________________
block_15_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_15_expand_relu[0][0]
__________________________________________________________________________________________________
block_15_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_15_depthwise[0][0]
__________________________________________________________________________________________________
block_15_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_15_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_15_project (Conv2D) (None, 5, 5, 160) 153600 block_15_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_15_project_BN (BatchNorma (None, 5, 5, 160) 640 block_15_project[0][0]
__________________________________________________________________________________________________
block_15_add (Add) (None, 5, 5, 160) 0 block_14_add[0][0]
block_15_project_BN[0][0]
__________________________________________________________________________________________________
block_16_expand (Conv2D) (None, 5, 5, 960) 153600 block_15_add[0][0]
__________________________________________________________________________________________________
block_16_expand_BN (BatchNormal (None, 5, 5, 960) 3840 block_16_expand[0][0]
__________________________________________________________________________________________________
block_16_expand_relu (ReLU) (None, 5, 5, 960) 0 block_16_expand_BN[0][0]
__________________________________________________________________________________________________
block_16_depthwise (DepthwiseCo (None, 5, 5, 960) 8640 block_16_expand_relu[0][0]
__________________________________________________________________________________________________
block_16_depthwise_BN (BatchNor (None, 5, 5, 960) 3840 block_16_depthwise[0][0]
__________________________________________________________________________________________________
block_16_depthwise_relu (ReLU) (None, 5, 5, 960) 0 block_16_depthwise_BN[0][0]
__________________________________________________________________________________________________
block_16_project (Conv2D) (None, 5, 5, 320) 307200 block_16_depthwise_relu[0][0]
__________________________________________________________________________________________________
block_16_project_BN (BatchNorma (None, 5, 5, 320) 1280 block_16_project[0][0]
__________________________________________________________________________________________________
Conv_1 (Conv2D) (None, 5, 5, 1280) 409600 block_16_project_BN[0][0]
__________________________________________________________________________________________________
Conv_1_bn (BatchNormalization) (None, 5, 5, 1280) 5120 Conv_1[0][0]
__________________________________________________________________________________________________
out_relu (ReLU) (None, 5, 5, 1280) 0 Conv_1_bn[0][0]
__________________________________________________________________________________________________
global_average_pooling2d_6 (Glo (None, 1280) 0 out_relu[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 32) 40992 global_average_pooling2d_6[0][0]
__________________________________________________________________________________________________
dropout (Dropout) (None, 32) 0 dense_2[0][0]
__________________________________________________________________________________________________
dense_3 (Dense) (None, 1) 33 dropout[0][0]
==================================================================================================
Total params: 2,299,009
Trainable params: 41,025
Non-trainable params: 2,257,984
__________________________________________________________________________________________________
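The trainable-parameter count in the summary above (41,025 = 40,992 for `dense_2` plus 33 for `dense_3`) can be checked programmatically. A minimal sketch, assuming a built Keras model such as `frozen_new_model`:

```python
import numpy as np

def count_trainable_params(model):
    """Sum the element counts of all trainable weight tensors in a Keras model."""
    return int(sum(np.prod(w.shape) for w in model.trainable_weights))

# For the frozen model above this should give 41025 (40992 + 33).
```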
You are now ready to train the new model on the dogs vs cats data subset. We will use an EarlyStopping callback with patience set to 2 epochs, as before. Feel free to increase the training time if you wish.
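The cell below constructs the callback with only `patience=2`, so it monitors `val_loss` by default and stops once validation loss has failed to improve for two consecutive epochs. A more explicit variant (optional; `restore_best_weights` is an extra Keras option not used in this assignment) might look like:

```python
import tensorflow as tf

# Stops after 2 epochs without val_loss improvement and rolls the model
# back to the best weights seen during training.
earlystopping = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=2,
    restore_best_weights=True
)
```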
# Train the model and save its training history
earlystopping = tf.keras.callbacks.EarlyStopping(patience=2)
history_frozen_new_model = frozen_new_model.fit(
    images_train, labels_train, epochs=10, batch_size=32,
    validation_data=(images_valid, labels_valid), callbacks=[earlystopping]
)
Epoch 1/10
19/19 [==============================] - 2s 113ms/step - loss: 0.3416 - accuracy: 0.8517 - val_loss: 0.1403 - val_accuracy: 0.9533
Epoch 2/10
19/19 [==============================] - 1s 37ms/step - loss: 0.1475 - accuracy: 0.9467 - val_loss: 0.1113 - val_accuracy: 0.9533
Epoch 3/10
19/19 [==============================] - 1s 36ms/step - loss: 0.1071 - accuracy: 0.9600 - val_loss: 0.0994 - val_accuracy: 0.9500
Epoch 4/10
19/19 [==============================] - 1s 35ms/step - loss: 0.0863 - accuracy: 0.9767 - val_loss: 0.1020 - val_accuracy: 0.9567
Epoch 5/10
19/19 [==============================] - 1s 36ms/step - loss: 0.0920 - accuracy: 0.9667 - val_loss: 0.1084 - val_accuracy: 0.9533
# Run this cell to plot accuracy vs epoch and loss vs epoch

plt.figure(figsize=(15, 5))
plt.subplot(121)
try:
    plt.plot(history_frozen_new_model.history['accuracy'])
    plt.plot(history_frozen_new_model.history['val_accuracy'])
except KeyError:
    plt.plot(history_frozen_new_model.history['acc'])
    plt.plot(history_frozen_new_model.history['val_acc'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='lower right')

plt.subplot(122)
plt.plot(history_frozen_new_model.history['loss'])
plt.plot(history_frozen_new_model.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc='upper right')
plt.show()
# Evaluate the new model on the test set
new_model_test_loss, new_model_test_acc = frozen_new_model.evaluate(images_test, labels_test, verbose = 0)
print("Test loss: {}".format(new_model_test_loss))
print("Test accuracy: {}".format(new_model_test_acc))
Test loss: 0.10417315363883972
Test accuracy: 0.9566666483879089
Finally, we will compare the training, validation and test metrics of the benchmark and transfer learning models.
# Gather the benchmark and new model metrics
benchmark_train_loss = history_benchmark.history['loss'][-1]
benchmark_valid_loss = history_benchmark.history['val_loss'][-1]
try:
benchmark_train_acc = history_benchmark.history['acc'][-1]
benchmark_valid_acc = history_benchmark.history['val_acc'][-1]
except KeyError:
benchmark_train_acc = history_benchmark.history['accuracy'][-1]
benchmark_valid_acc = history_benchmark.history['val_accuracy'][-1]
new_model_train_loss = history_frozen_new_model.history['loss'][-1]
new_model_valid_loss = history_frozen_new_model.history['val_loss'][-1]
try:
new_model_train_acc = history_frozen_new_model.history['acc'][-1]
new_model_valid_acc = history_frozen_new_model.history['val_acc'][-1]
except KeyError:
new_model_train_acc = history_frozen_new_model.history['accuracy'][-1]
new_model_valid_acc = history_frozen_new_model.history['val_accuracy'][-1]
# Compile the metrics into a pandas DataFrame and display the table
comparison_table = pd.DataFrame (
[
['Training loss', benchmark_train_loss, new_model_train_loss],
['Training accuracy', benchmark_train_acc, new_model_train_acc],
['Validation loss', benchmark_valid_loss, new_model_valid_loss],
['Validation accuracy', benchmark_valid_acc, new_model_valid_acc],
['Test loss', benchmark_test_loss, new_model_test_loss],
['Test accuracy', benchmark_test_acc, new_model_test_acc]
],
columns = ['Metric', 'Benchmark CNN', 'Transfer learning CNN']
)
comparison_table.index = [''] * 6
comparison_table
| Metric | Benchmark CNN | Transfer learning CNN |
|---|---|---|
| Training loss | 0.690837 | 0.092037 |
| Training accuracy | 0.526667 | 0.966667 |
| Validation loss | 0.698211 | 0.108419 |
| Validation accuracy | 0.500000 | 0.953333 |
| Test loss | 0.697630 | 0.104173 |
| Test accuracy | 0.503333 | 0.956667 |
# Plot confusion matrices for benchmark and transfer learning models
plt.figure(figsize = (15, 5))
preds = benchmark_model.predict(images_test)
preds = (preds >= 0.5).astype(np.int32)
cm = confusion_matrix(labels_test, preds)
df_cm = pd.DataFrame(cm, index = ['Dog', 'Cat'], columns = ['Dog', 'Cat'])
plt.subplot(121)
plt.title("Confusion matrix for benchmark model\n")
sns.heatmap(df_cm, annot = True, fmt = "d", cmap = "YlGnBu")
plt.ylabel("Actual")
plt.xlabel("Predicted")
preds = frozen_new_model.predict(images_test)
preds = (preds >= 0.5).astype(np.int32)
cm = confusion_matrix(labels_test, preds)
df_cm = pd.DataFrame(cm, index = ['Dog', 'Cat'], columns = ['Dog', 'Cat'])
plt.subplot(122)
plt.title("Confusion matrix for transfer learning model\n")
sns.heatmap(df_cm, annot = True, fmt = "d", cmap = "YlGnBu")
plt.ylabel("Actual")
plt.xlabel("Predicted")
plt.show()
Congratulations on completing this programming assignment! In the next week of the course we will learn how to develop an effective data pipeline.
In this notebook, you will implement a data processing pipeline using tools from both Keras and the tf.data module. You will use the ImageDataGenerator class in the tf.keras module to feed a network with training and test images from a local directory containing a subset of the LSUN dataset, and train the model both with and without data augmentation. You will then use the map and filter functions of the Dataset class with the CIFAR-100 dataset to train a network to classify a processed subset of the images.
Some code cells are provided for you in the notebook. You should avoid editing provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line:
#### GRADED CELL ####
Don't move or edit this first line - this is what the automatic grader looks for to recognise graded cells. These cells require you to write your own code to complete them, and are automatically graded when you submit the notebook. Don't edit the function name or signature provided in these cells, otherwise the automatic grader might not function properly. Inside these graded cells, you can use any functions or classes that are imported below, but make sure you don't use any variables that are outside the scope of the function.
Complete all the tasks you are asked for in the worksheet. When you have finished and are happy with your code, press the Submit Assignment button at the top of this notebook.
We'll start by running some imports and loading the dataset. Do not edit the existing imports in the following cell. If you would like to make further Tensorflow imports, you should add them here.
#### PACKAGE IMPORTS ####
# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook
import tensorflow as tf
from tensorflow.keras.datasets import cifar100
import numpy as np
import matplotlib.pyplot as plt
import json
%matplotlib inline
# If you would like to make further imports from tensorflow, add them here
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
In the first part of this assignment, you will use a subset of the LSUN dataset. This is a large-scale image dataset with 10 scene and 20 object categories. A subset of the LSUN dataset has been provided, and has already been split into training and test sets. The three classes included in the subset are church_outdoor, classroom and conference_room.
Your goal is to use the Keras preprocessing tools to construct a data ingestion and augmentation pipeline to train a neural network to classify the images into the three classes.
# Save the directory locations for the training, validation and test sets
train_dir = 'data/lsun/train'
valid_dir = 'data/lsun/valid'
test_dir = 'data/lsun/test'
You should first write a function that creates an ImageDataGenerator object, which rescales the image pixel values by a factor of 1/255.
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def get_ImageDataGenerator():
"""
This function should return an instance of the ImageDataGenerator class.
This instance should be set up to rescale the data with the above scaling factor.
"""
return ImageDataGenerator(rescale = 1. / 255)
# Call the function to get an ImageDataGenerator as specified
image_gen = get_ImageDataGenerator()
You should now write a function that returns a generator object that will yield batches of images and labels from the training and test set directories. The generators should:

- Generate batches of 20 images, resized to 64 x 64.
- One-hot encode the labels as follows: classroom $\rightarrow$ [1., 0., 0.], conference_room $\rightarrow$ [0., 1., 0.], church_outdoor $\rightarrow$ [0., 0., 1.]
- Accept an optional seed for shuffling (this should be passed into the flow_from_directory method).

Hint: you may need to refer to the documentation for the ImageDataGenerator.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def get_generator(image_data_generator, directory, seed = None):
"""
This function takes an ImageDataGenerator object in the first argument and a
directory path in the second argument.
It should use the ImageDataGenerator to return a generator object according
to the above specifications.
The seed argument should be passed to the flow_from_directory method.
"""
return image_data_generator.flow_from_directory (
directory = directory, batch_size = 20, target_size = (64, 64),
classes = ['classroom', 'conference_room', 'church_outdoor'],
seed = seed
)
# Run this cell to define training and validation generators
train_generator = get_generator(image_gen, train_dir)
valid_generator = get_generator(image_gen, valid_dir)
Found 300 images belonging to 3 classes.
Found 120 images belonging to 3 classes.
We are using a small subset of the dataset for demonstrative purposes in this assignment.
The following cell depends on your function get_generator being implemented correctly. If it raises an error, go back and check the function specifications carefully.
# Display a few images and labels from the training set
batch = next(train_generator)
batch_images = np.array(batch[0])
batch_labels = np.array(batch[1])
lsun_classes = ['classroom', 'conference_room', 'church_outdoor']
plt.figure(figsize = (16, 10))
for i in range(20):
ax = plt.subplot(4, 5, i + 1)
plt.imshow(batch_images[i])
plt.title(lsun_classes[np.where(batch_labels[i] == 1.)[0][0]])
plt.axis('off')
# Reset the training generator
train_generator = get_generator(image_gen, train_dir)
Found 300 images belonging to 3 classes.
You will now build and compile a convolutional neural network classifier. Using the functional API, build your model according to the following specifications:
- Use the input_shape in the function argument to define the Input layer.
- The first hidden layer should be a Conv2D layer with 8 filters and an 8 x 8 kernel, followed by a MaxPool2D layer with a 2 x 2 pooling window.
- The next hidden layer should be a Conv2D layer with 4 filters and a 4 x 4 kernel, followed by another MaxPool2D layer with a 2 x 2 pooling window.
- Both convolutional layers should use "SAME" padding and a ReLU activation function.
- These should be followed by a Flatten layer, and then a Dense layer with 16 units and a ReLU activation.
- The output layer should be a Dense layer with 3 units and a softmax activation.

In total, the network should have 8 layers. The model should then be compiled with the Adam optimizer with learning rate 0.0005, categorical cross entropy loss, and categorical accuracy metric.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def get_model(input_shape):
"""
This function should build and compile a CNN model according to the above specification,
using the functional API. Your function should return the model.
"""
inputs = Input(shape = input_shape)
h = Conv2D(filters = 8, kernel_size = (8, 8), activation = 'relu', padding = 'same')(inputs)
h = MaxPool2D(pool_size = (2, 2))(h)
h = Conv2D(filters = 4, kernel_size = (4, 4), activation = 'relu', padding = 'same')(h)
h = MaxPool2D(pool_size = (2, 2))(h)
h = Flatten()(h)
h = Dense(units = 16, activation = 'relu')(h)
outputs = Dense(units = 3, activation = 'softmax')(h)
model = Model(inputs = inputs, outputs = outputs)
model.compile (
optimizer = Adam(learning_rate = 5e-4),
loss = 'categorical_crossentropy',
metrics = ['accuracy']
)
return model
# Build and compile the model, print the model summary
lsun_model = get_model((64, 64, 3))
lsun_model.summary()
Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 64, 64, 3)]       0
_________________________________________________________________
conv2d (Conv2D)              (None, 64, 64, 8)         1544
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 32, 32, 8)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 4)         516
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 4)         0
_________________________________________________________________
flatten (Flatten)            (None, 1024)              0
_________________________________________________________________
dense (Dense)                (None, 16)                16400
_________________________________________________________________
dense_1 (Dense)              (None, 3)                 51
=================================================================
Total params: 18,511
Trainable params: 18,511
Non-trainable params: 0
_________________________________________________________________
You should now write a function to train the model for a specified number of epochs (specified in the epochs argument). The function takes a model argument, as well as train_gen and valid_gen arguments for the training and validation generators respectively, which you should use for training and validation data in the training run. You should also use the following callbacks:
- An EarlyStopping callback that monitors the validation accuracy and has patience set to 10.
- A ReduceLROnPlateau callback that monitors the validation loss and has the factor set to 0.5 and minimum learning rate set to 0.0001.

Your function should return the training history.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def train_model(model, train_gen, valid_gen, epochs):
"""
This function should define the callback objects specified above, and then use the
train_gen and valid_gen generator object arguments to train the model for the (maximum)
number of epochs specified in the function argument, using the defined callbacks.
The function should return the training history.
"""
callbacks = [
EarlyStopping(monitor = 'val_accuracy', patience = 10),
ReduceLROnPlateau(monitor = 'val_loss', factor = 0.5, min_lr = 1e-4)
]
return model.fit_generator (
train_gen,
steps_per_epoch = train_gen.n // train_gen.batch_size,
validation_data = valid_gen,
validation_steps = valid_gen.n // valid_gen.batch_size,
epochs = epochs,
callbacks = callbacks
)
# Train the model for (maximum) 50 epochs
history = train_model(lsun_model, train_generator, valid_generator, epochs = 50)
WARNING:tensorflow:From <ipython-input-11-ccd41008632f>:23: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version. Instructions for updating: Please use Model.fit, which supports generators.
Epoch 1/50 - loss: 1.0814 - accuracy: 0.4067 - val_loss: 1.0479 - val_accuracy: 0.5333 - lr: 5.0000e-04
Epoch 2/50 - loss: 1.0026 - accuracy: 0.5200 - val_loss: 0.9453 - val_accuracy: 0.5417 - lr: 5.0000e-04
Epoch 3/50 - loss: 0.8756 - accuracy: 0.5633 - val_loss: 0.8659 - val_accuracy: 0.5417 - lr: 5.0000e-04
Epoch 4/50 - loss: 0.8371 - accuracy: 0.5633 - val_loss: 0.8616 - val_accuracy: 0.6500 - lr: 5.0000e-04
Epoch 5/50 - loss: 0.7448 - accuracy: 0.6633 - val_loss: 0.8323 - val_accuracy: 0.5917 - lr: 5.0000e-04
Epoch 6/50 - loss: 0.6990 - accuracy: 0.7133 - val_loss: 0.8257 - val_accuracy: 0.5833 - lr: 5.0000e-04
Epoch 7/50 - loss: 0.7444 - accuracy: 0.6200 - val_loss: 0.7518 - val_accuracy: 0.6500 - lr: 5.0000e-04
Epoch 8/50 - loss: 0.6756 - accuracy: 0.7200 - val_loss: 0.7254 - val_accuracy: 0.6583 - lr: 5.0000e-04
Epoch 9/50 - loss: 0.6183 - accuracy: 0.7500 - val_loss: 0.6987 - val_accuracy: 0.7083 - lr: 5.0000e-04
Epoch 10/50 - loss: 0.6197 - accuracy: 0.7267 - val_loss: 0.8660 - val_accuracy: 0.5833 - lr: 5.0000e-04
Epoch 11/50 - loss: 0.5658 - accuracy: 0.7567 - val_loss: 0.7300 - val_accuracy: 0.5917 - lr: 5.0000e-04
Epoch 12/50 - loss: 0.5731 - accuracy: 0.7833 - val_loss: 0.6971 - val_accuracy: 0.6417 - lr: 5.0000e-04
Epoch 13/50 - loss: 0.5303 - accuracy: 0.7867 - val_loss: 0.7281 - val_accuracy: 0.6833 - lr: 5.0000e-04
Epoch 14/50 - loss: 0.5356 - accuracy: 0.7800 - val_loss: 0.7112 - val_accuracy: 0.7417 - lr: 5.0000e-04
Epoch 15/50 - loss: 0.5076 - accuracy: 0.7967 - val_loss: 0.8024 - val_accuracy: 0.6083 - lr: 5.0000e-04
Epoch 16/50 - loss: 0.4730 - accuracy: 0.8167 - val_loss: 0.6715 - val_accuracy: 0.7333 - lr: 5.0000e-04
Epoch 17/50 - loss: 0.4560 - accuracy: 0.8300 - val_loss: 0.7681 - val_accuracy: 0.6917 - lr: 5.0000e-04
Epoch 18/50 - loss: 0.4314 - accuracy: 0.8433 - val_loss: 0.7340 - val_accuracy: 0.6917 - lr: 5.0000e-04
Epoch 19/50 - loss: 0.4623 - accuracy: 0.8133 - val_loss: 0.7888 - val_accuracy: 0.5750 - lr: 5.0000e-04
Epoch 20/50 - loss: 0.4177 - accuracy: 0.8267 - val_loss: 0.7551 - val_accuracy: 0.6250 - lr: 5.0000e-04
Epoch 21/50 - loss: 0.4139 - accuracy: 0.8367 - val_loss: 0.7930 - val_accuracy: 0.6833 - lr: 5.0000e-04
Epoch 22/50 - loss: 0.3751 - accuracy: 0.8633 - val_loss: 0.7657 - val_accuracy: 0.7167 - lr: 5.0000e-04
Epoch 23/50 - loss: 0.3485 - accuracy: 0.8867 - val_loss: 0.6881 - val_accuracy: 0.7333 - lr: 5.0000e-04
Epoch 24/50 - loss: 0.3430 - accuracy: 0.8767 - val_loss: 0.8482 - val_accuracy: 0.6167 - lr: 5.0000e-04
# Run this cell to plot accuracy vs epoch and loss vs epoch
plt.figure(figsize = (15, 5))
plt.subplot(121)
try:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
except KeyError:
try:
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
except KeyError:
plt.plot(history.history['categorical_accuracy'])
plt.plot(history.history['val_categorical_accuracy'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'lower right')
plt.subplot(122)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'upper right')
plt.show()
You may notice overfitting in the above plots, visible as a growing discrepancy between the training and validation loss and accuracy. We will aim to mitigate this using data augmentation: given our limited dataset, we may be able to improve performance by applying random modifications to the images in the training data, effectively increasing the size of the dataset.
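As a minimal illustration of this idea (using a made-up, flip-only configuration, simpler than the augmentation used in this assignment), the same input image can yield different training examples each time it is drawn from the generator:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Flip-only augmentation: a deliberately simple, made-up configuration
datagen = ImageDataGenerator(horizontal_flip=True)

# A single tiny "image" with distinguishable pixel values
image = np.arange(2 * 2 * 3, dtype=np.float32).reshape(1, 2, 2, 3)

# Draw the same image several times from the generator
flow = datagen.flow(image, batch_size=1, shuffle=False)
draws = [next(flow)[0] for _ in range(8)]

# Every draw is either the original image or its horizontal flip
original, flipped = image[0], image[0][:, ::-1, :]
for d in draws:
    assert np.allclose(d, original) or np.allclose(d, flipped)
```

Each random modification is applied on the fly per batch, so no augmented copies are ever stored on disk.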
You should now write a function to create a new ImageDataGenerator object, which performs the following data preprocessing and augmentation:

- Rescale the image pixel values by a factor of 1/255.
- Randomly rotate images by up to 30 degrees.
- Randomly adjust the image brightness in the range (0.5, 1.5).
- Randomly flip images horizontally.

Hint: you may need to refer to the documentation for the ImageDataGenerator.
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def get_ImageDataGenerator_augmented():
"""
This function should return an instance of the ImageDataGenerator class
with the above specifications.
"""
return ImageDataGenerator (
rescale = 1. / 255,
rotation_range = 30,
brightness_range = (0.5, 1.5),
horizontal_flip = True
)
# Call the function to get an ImageDataGenerator as specified
image_gen_aug = get_ImageDataGenerator_augmented()
# Run this cell to define training and validation generators
valid_generator_aug = get_generator(image_gen_aug, valid_dir)
train_generator_aug = get_generator(image_gen_aug, train_dir, seed = 10)
Found 120 images belonging to 3 classes.
Found 300 images belonging to 3 classes.
# Reset the original train_generator with the same random seed
train_generator = get_generator(image_gen, train_dir, seed = 10)
Found 300 images belonging to 3 classes.
The following cell depends on your function get_generator being implemented correctly. If it raises an error, go back and check the function specifications carefully.
The cell will display augmented and non-augmented images (and labels) from the training dataset, using the train_generator_aug and train_generator objects defined above (if the images do not correspond to each other, check you have implemented the seed argument correctly).
# Display a few images and labels from the non-augmented and augmented generators
batch = next(train_generator)
batch_images = np.array(batch[0])
batch_labels = np.array(batch[1])
aug_batch = next(train_generator_aug)
aug_batch_images = np.array(aug_batch[0])
aug_batch_labels = np.array(aug_batch[1])
plt.figure(figsize = (16, 5))
plt.suptitle("Unaugmented images", fontsize = 16)
for n, i in enumerate(np.arange(10)):
ax = plt.subplot(2, 5, n + 1)
plt.imshow(batch_images[i])
plt.title(lsun_classes[np.where(batch_labels[i] == 1.)[0][0]])
plt.axis('off')
plt.figure(figsize=(16,5))
plt.suptitle("Augmented images", fontsize = 16)
for n, i in enumerate(np.arange(10)):
ax = plt.subplot(2, 5, n + 1)
plt.imshow(aug_batch_images[i])
plt.title(lsun_classes[np.where(aug_batch_labels[i] == 1.)[0][0]])
plt.axis('off')
# Reset the augmented data generator
train_generator_aug = get_generator(image_gen_aug, train_dir)
Found 300 images belonging to 3 classes.
# Build and compile a new model
lsun_new_model = get_model((64, 64, 3))
# Train the model
history_augmented = train_model(lsun_new_model, train_generator_aug, valid_generator_aug, epochs = 50)
Epoch 1/50 - loss: 1.1252 - accuracy: 0.3700 - val_loss: 1.0906 - val_accuracy: 0.3500 - lr: 5.0000e-04
Epoch 2/50 - loss: 1.0838 - accuracy: 0.3733 - val_loss: 1.0862 - val_accuracy: 0.4083 - lr: 5.0000e-04
Epoch 3/50 - loss: 1.0652 - accuracy: 0.4367 - val_loss: 1.0769 - val_accuracy: 0.4667 - lr: 5.0000e-04
Epoch 4/50 - loss: 1.0268 - accuracy: 0.5367 - val_loss: 1.0455 - val_accuracy: 0.4500 - lr: 5.0000e-04
Epoch 5/50 - loss: 0.9739 - accuracy: 0.5533 - val_loss: 0.9943 - val_accuracy: 0.5167 - lr: 5.0000e-04
Epoch 6/50 - loss: 0.8833 - accuracy: 0.6267 - val_loss: 0.9422 - val_accuracy: 0.5500 - lr: 5.0000e-04
Epoch 7/50 - loss: 0.8571 - accuracy: 0.6000 - val_loss: 0.8214 - val_accuracy: 0.6167 - lr: 5.0000e-04
Epoch 8/50 - loss: 0.8081 - accuracy: 0.6333 - val_loss: 0.8473 - val_accuracy: 0.5750 - lr: 5.0000e-04
Epoch 9/50 - loss: 0.8671 - accuracy: 0.6000 - val_loss: 1.0021 - val_accuracy: 0.5667 - lr: 5.0000e-04
Epoch 10/50 - loss: 0.8379 - accuracy: 0.6133 - val_loss: 0.7466 - val_accuracy: 0.6750 - lr: 5.0000e-04
Epoch 11/50 - loss: 0.7689 - accuracy: 0.6633 - val_loss: 0.8100 - val_accuracy: 0.6250 - lr: 5.0000e-04
Epoch 12/50 - loss: 0.7453 - accuracy: 0.6767 - val_loss: 0.8221 - val_accuracy: 0.5833 - lr: 5.0000e-04
Epoch 13/50 - loss: 0.7786 - accuracy: 0.6467 - val_loss: 0.7862 - val_accuracy: 0.6583 - lr: 5.0000e-04
Epoch 14/50 - loss: 0.7317 - accuracy: 0.6800 - val_loss: 0.7873 - val_accuracy: 0.6417 - lr: 5.0000e-04
Epoch 15/50 - loss: 0.7223 - accuracy: 0.7000 - val_loss: 0.7223 - val_accuracy: 0.6750 - lr: 5.0000e-04
Epoch 16/50 - loss: 0.7115 - accuracy: 0.6867 - val_loss: 0.7492 - val_accuracy: 0.6333 - lr: 5.0000e-04
Epoch 17/50 - loss: 0.6875 - accuracy: 0.7200 - val_loss: 0.6983 - val_accuracy: 0.7167 - lr: 5.0000e-04
Epoch 18/50 - loss: 0.7255 - accuracy: 0.6633 - val_loss: 0.7704 - val_accuracy: 0.6667 - lr: 5.0000e-04
Epoch 19/50 - loss: 0.6569 - accuracy: 0.7300 - val_loss: 0.7671 - val_accuracy: 0.6917 - lr: 5.0000e-04
Epoch 20/50 - loss: 0.6997 - accuracy: 0.7100 - val_loss: 0.6996 - val_accuracy: 0.7000 - lr: 5.0000e-04
Epoch 21/50 - loss: 0.6489 - accuracy: 0.7267 - val_loss: 0.8197 - val_accuracy: 0.6250 - lr: 5.0000e-04
Epoch 22/50 - loss: 0.6237 - accuracy: 0.7367 - val_loss: 0.6889 - val_accuracy: 0.6667 - lr: 5.0000e-04
Epoch 23/50 - loss: 0.6434 - accuracy: 0.7300 - val_loss: 0.8957 - val_accuracy: 0.6000 - lr: 5.0000e-04
Epoch 24/50 - loss: 0.6750 - accuracy: 0.7167 - val_loss: 0.7749 - val_accuracy: 0.6500 - lr: 5.0000e-04
Epoch 25/50 - loss: 0.6811 - accuracy: 0.6900 - val_loss: 0.7737 - val_accuracy: 0.6500 - lr: 5.0000e-04
Epoch 26/50 - loss: 0.6417 - accuracy: 0.7467 - val_loss: 0.8007 - val_accuracy: 0.6250 - lr: 5.0000e-04
Epoch 27/50 - loss: 0.6394 - accuracy: 0.7300 - val_loss: 0.7682 - val_accuracy: 0.6583 - lr: 5.0000e-04
# Run this cell to plot accuracy vs epoch and loss vs epoch
plt.figure(figsize = (15, 5))
plt.subplot(121)
try:
plt.plot(history_augmented.history['accuracy'])
plt.plot(history_augmented.history['val_accuracy'])
except KeyError:
try:
plt.plot(history_augmented.history['acc'])
plt.plot(history_augmented.history['val_acc'])
except KeyError:
plt.plot(history_augmented.history['categorical_accuracy'])
plt.plot(history_augmented.history['val_categorical_accuracy'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'lower right')
plt.subplot(122)
plt.plot(history_augmented.history['loss'])
plt.plot(history_augmented.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'upper right')
plt.show()
Do you see an improvement in the overfitting? This will of course vary based on your particular run and whether you have altered the hyperparameters.
# Get model predictions for the first 3 batches of test data
num_batches = 3
seed = 25
test_generator = get_generator(image_gen_aug, test_dir, seed = seed)
predictions = lsun_new_model.predict_generator(test_generator, steps = num_batches)
Found 300 images belonging to 3 classes.
WARNING:tensorflow:From <ipython-input-23-21bbcaef6361>:5: Model.predict_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version. Instructions for updating: Please use Model.predict, which supports generators.
# Run this cell to view randomly selected images and model predictions
# Get images and ground truth labels
test_generator = get_generator(image_gen_aug, test_dir, seed = seed)
batches = []
for i in range(num_batches):
batches.append(next(test_generator))
batch_images = np.vstack([b[0] for b in batches])
batch_labels = np.concatenate([b[1].astype(np.int32) for b in batches])
# Randomly select images from the batch
inx = np.random.choice(predictions.shape[0], 4, replace = False)
print(inx)
fig, axes = plt.subplots(4, 2, figsize = (16, 12))
fig.subplots_adjust(hspace = 0.4, wspace = -0.2)
for n, i in enumerate(inx):
axes[n, 0].imshow(batch_images[i])
axes[n, 0].get_xaxis().set_visible(False)
axes[n, 0].get_yaxis().set_visible(False)
axes[n, 0].text (
30., -3.5, lsun_classes[np.where(batch_labels[i] == 1.)[0][0]],
horizontalalignment = 'center'
)
axes[n, 1].bar(np.arange(len(predictions[i])), predictions[i])
axes[n, 1].set_xticks(np.arange(len(predictions[i])))
axes[n, 1].set_xticklabels(lsun_classes)
axes[n, 1].set_title (
f"Categorical distribution. Model prediction: {lsun_classes[np.argmax(predictions[i])]}"
)
plt.show()
Found 300 images belonging to 3 classes.
[26 14 27 55]
Congratulations! This completes the first part of the programming assignment using the tf.keras image data processing tools.
In the second part of this assignment, you will use the CIFAR-100 dataset. This image dataset has 100 classes with 500 training images and 100 test images per class.
Your goal is to use the tf.data module preprocessing tools to construct a data ingestion pipeline including filtering and function mapping over the dataset to train a neural network to classify the images.
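Before loading the real data, here is a toy sketch (made-up numbers, not part of the assignment) of the filter-then-map pattern that the rest of this part builds up:

```python
import numpy as np
import tensorflow as tf

# A toy dataset of integers standing in for (image, label) elements
dataset = tf.data.Dataset.from_tensor_slices(np.array([0, 1, 2, 3, 4, 5]))

# filter: keep only elements satisfying a predicate (here, even numbers)
dataset = dataset.filter(lambda x: tf.equal(x % 2, 0))

# map: apply a transformation to every remaining element
dataset = dataset.map(lambda x: x * x)

print([int(x) for x in dataset])  # [0, 4, 16]
```

Both methods are lazy: nothing is computed until the dataset is iterated, which is what makes this style of pipeline efficient for large datasets.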
# Load the data, along with the labels
(train_data, train_labels), (test_data, test_labels) = cifar100.load_data(label_mode = 'fine')
with open('data/cifar100/cifar100_labels.json', 'r') as j:
cifar_labels = json.load(j)
# Display a few images and labels
plt.figure(figsize = (15, 8))
inx = np.random.choice(train_data.shape[0], 32, replace = False)
for n, i in enumerate(inx):
ax = plt.subplot(4, 8, n + 1)
plt.imshow(train_data[i])
plt.title(cifar_labels[int(train_labels[i])])
plt.axis('off')
You should now write a function to create a tf.data.Dataset object for each of the training and test images and labels. This function should take a numpy array of images in the first argument and a numpy array of labels in the second argument, and create a Dataset object.
Your function should then return the Dataset object. Do not batch or shuffle the Dataset (this will be done later).
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def create_dataset(data, labels):
"""
This function takes a numpy array batch of images in the first argument, and
a corresponding array containing the labels in the second argument.
The function should then create a tf.data.Dataset object with these inputs
and outputs, and return it.
"""
return tf.data.Dataset.from_tensor_slices(tensors = (data, labels))
# Run the below cell to convert the training and test data and labels into datasets
train_dataset = create_dataset(train_data, train_labels)
test_dataset = create_dataset(test_data, test_labels)
# Check the element_spec of your datasets
print(train_dataset.element_spec)
print(test_dataset.element_spec)
(TensorSpec(shape=(32, 32, 3), dtype=tf.uint8, name=None), TensorSpec(shape=(1,), dtype=tf.int64, name=None))
(TensorSpec(shape=(32, 32, 3), dtype=tf.uint8, name=None), TensorSpec(shape=(1,), dtype=tf.int64, name=None))
Write a function to filter the train and test datasets so that they only generate images that belong to a specified set of classes.
The function should take a Dataset object in the first argument, and a list of integer class indices in the second argument. Inside your function you should define an auxiliary function that you will use with the filter method of the Dataset object. This auxiliary function should take image and label arguments (as in the element_spec) for a single element in the batch, and return a boolean indicating if the label is one of the allowed classes.
Your function should then return the filtered dataset.
Hint: you may need to use the tf.equal, tf.cast and tf.math.reduce_any functions in your auxiliary function.
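To see what the hinted functions compute, here is a standalone check (with made-up label values) of whether a single label belongs to a set of allowed classes:

```python
import tensorflow as tf

classes = tf.constant([0, 29, 99], dtype=tf.int64)  # allowed class indices
label = tf.constant([29], dtype=tf.int64)           # shape (1,), as in the element_spec

matches = tf.equal(label, classes)   # element-wise comparison: [False, True, False]
keep = tf.math.reduce_any(matches)   # True if the label matches any allowed class

print(bool(keep))  # True
```

The predicate passed to filter must return a scalar boolean per element, which is exactly what the reduce_any reduction produces here.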
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def filter_classes(dataset, classes):
"""
This function should filter the dataset by only retaining dataset elements whose
label belongs to one of the integers in the classes list.
The function should then return the filtered Dataset object.
"""
labels = tf.constant(classes, dtype = tf.int64)
return dataset.filter(lambda _, label: tf.reduce_any(tf.equal(label, labels)))
# Run the below cell to filter the datasets using your function
cifar_classes = [0, 29, 99] # Your datasets should contain only classes in this list
train_dataset = filter_classes(train_dataset, cifar_classes)
test_dataset = filter_classes(test_dataset, cifar_classes)
You should now write two functions that use the map method to process the images and labels in the filtered dataset.
The first function should one-hot encode the remaining labels so that we can train the network using a categorical cross entropy loss.
The function should take a Dataset object as an argument. Inside your function you should define an auxiliary function that you will use with the map method of the Dataset object. This auxiliary function should take image and label arguments (as in the element_spec) for a single element in the batch, and return a tuple of two elements, with the unmodified image in the first element, and a one-hot vector in the second element. The labels should be encoded according to the following:
- class 0 should be encoded as [1., 0., 0.]
- class 29 should be encoded as [0., 1., 0.]
- class 99 should be encoded as [0., 0., 1.]

Your function should then return the mapped dataset.
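Before the graded cell, here is a small (ungraded) sketch of one way to compute such an encoding: find the position of a label within the allowed class list using `tf.equal` and `tf.argmax`, then one-hot encode that position. The helper name `to_one_hot` is made up for this illustration:

```python
import tensorflow as tf

cifar_classes = [0, 29, 99]

def to_one_hot(label):
    # Position of the label within the allowed class list (e.g. 29 -> 1)
    index = tf.argmax(tf.cast(tf.equal(label, cifar_classes), tf.int32))
    # One-hot vector of length 3 with a 1 at that position
    return tf.one_hot(index, depth=len(cifar_classes))

print(to_one_hot(tf.constant(29, dtype=tf.int64)).numpy())  # [0. 1. 0.]
```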
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def map_labels(dataset):
    """
    This function should map over the dataset to convert the label to a
    one-hot vector. The encoding should be done according to the above specification.
    The function should then return the mapped Dataset object.
    """
    return dataset.map(
        lambda image, label: (
            image,
            # Find the label's position in cifar_classes and one-hot encode it
            tf.one_hot(tf.argmax(tf.cast(tf.equal(label, cifar_classes), dtype = tf.int32)), 3)
        )
    )
# Run the below cell to one-hot encode the training and test labels.
train_dataset = map_labels(train_dataset)
test_dataset = map_labels(test_dataset)
The second function should process the images according to the following specification:
The function should take a Dataset object as an argument. Inside your function you should again define an auxiliary function that you will use with the map method of the Dataset object. This auxiliary function should take image and label arguments (as in the element_spec) for a single element in the batch, and return a tuple of two elements, with the processed image in the first element, and the unmodified label in the second argument.
Your function should then return the mapped dataset.
Hint: you may find it useful to use tf.reduce_mean since the black and white image is the colour-average of the colour images. You can also use the keepdims keyword in tf.reduce_mean to retain the single colour channel.
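To see the hint in action, this small (ungraded) example rescales a dummy RGB image to the interval [0, 1] and averages over the colour channels with `keepdims=True`, so that the single greyscale channel is retained. The image values here are made up:

```python
import tensorflow as tf

# A dummy 2x2 RGB image with values in [0, 255]
image = tf.constant([[[255, 0, 0], [0, 255, 0]],
                     [[0, 0, 255], [255, 255, 255]]], dtype=tf.uint8)

# Rescale to [0, 1] and average over the colour channels, keeping the axis
grey = tf.reduce_mean(tf.cast(image, tf.float32) / 255., axis=-1, keepdims=True)
print(tuple(grey.shape))  # (2, 2, 1)
print(grey.numpy().squeeze())
```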
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def map_images(dataset):
    """
    This function should map over the dataset to process the image according to the
    above specification. The function should then return the mapped Dataset object.
    """
    return dataset.map(
        lambda image, label: (
            # Rescale to [0, 1] and average over the colour channels,
            # retaining the single greyscale channel
            tf.reduce_mean(tf.cast(image, tf.float32) / 255., axis = -1, keepdims = True),
            label
        )
    )
# Run the below cell to apply your mapping function to the datasets
train_dataset_bw = map_images(train_dataset)
test_dataset_bw = map_images(test_dataset)
# Run this cell to view a selection of images before and after processing
plt.figure(figsize = (16, 5))
plt.suptitle("Unprocessed images", fontsize = 16)
for n, elem in enumerate(train_dataset.take(10)):
    images, labels = elem
    ax = plt.subplot(2, 5, n + 1)
    plt.title(cifar_labels[cifar_classes[np.where(labels == 1.)[0][0]]])
    plt.imshow(np.squeeze(images), cmap = 'gray')
    plt.axis('off')
plt.figure(figsize = (16, 5))
plt.suptitle("Processed images", fontsize = 16)
for n, elem in enumerate(train_dataset_bw.take(10)):
    images_bw, labels_bw = elem
    ax = plt.subplot(2, 5, n + 1)
    plt.title(cifar_labels[cifar_classes[np.where(labels_bw == 1.)[0][0]]])
    plt.imshow(np.squeeze(images_bw), cmap = 'gray')
    plt.axis('off')
We will now batch and shuffle the Dataset objects.
# Run the below cell to batch and shuffle the training and test datasets
train_dataset_bw = train_dataset_bw.batch(10)
train_dataset_bw = train_dataset_bw.shuffle(100)
test_dataset_bw = test_dataset_bw.batch(10)
test_dataset_bw = test_dataset_bw.shuffle(100)
Now we will train a model using the Dataset objects. We will use the model specification and function from the first part of this assignment, only modifying the size of the input images.
# Build and compile a new model with our original spec, using the new image size
cifar_model = get_model((32, 32, 1))
# Train the model for 15 epochs
history = cifar_model.fit(train_dataset_bw, validation_data=test_dataset_bw, epochs = 15)
Epoch 1/15
150/150 [==============================] - 5s 31ms/step - loss: 1.0657 - accuracy: 0.4133 - val_loss: 0.9956 - val_accuracy: 0.4567
Epoch 2/15
150/150 [==============================] - 5s 30ms/step - loss: 0.9129 - accuracy: 0.5773 - val_loss: 0.8054 - val_accuracy: 0.6733
Epoch 3/15
150/150 [==============================] - 5s 31ms/step - loss: 0.7924 - accuracy: 0.6480 - val_loss: 0.8389 - val_accuracy: 0.6133
Epoch 4/15
150/150 [==============================] - 5s 31ms/step - loss: 0.7383 - accuracy: 0.6980 - val_loss: 0.7123 - val_accuracy: 0.7200
Epoch 5/15
150/150 [==============================] - 5s 31ms/step - loss: 0.6821 - accuracy: 0.7467 - val_loss: 0.7101 - val_accuracy: 0.7367
Epoch 6/15
150/150 [==============================] - 5s 31ms/step - loss: 0.6519 - accuracy: 0.7447 - val_loss: 0.6943 - val_accuracy: 0.7300
Epoch 7/15
150/150 [==============================] - 5s 31ms/step - loss: 0.6269 - accuracy: 0.7520 - val_loss: 0.6345 - val_accuracy: 0.7600
Epoch 8/15
150/150 [==============================] - 5s 31ms/step - loss: 0.5944 - accuracy: 0.7760 - val_loss: 0.6470 - val_accuracy: 0.7600
Epoch 9/15
150/150 [==============================] - 5s 31ms/step - loss: 0.5732 - accuracy: 0.7813 - val_loss: 0.6070 - val_accuracy: 0.7667
Epoch 10/15
150/150 [==============================] - 5s 31ms/step - loss: 0.5483 - accuracy: 0.7933 - val_loss: 0.5959 - val_accuracy: 0.7900
Epoch 11/15
150/150 [==============================] - 5s 31ms/step - loss: 0.5328 - accuracy: 0.8040 - val_loss: 0.6205 - val_accuracy: 0.7700
Epoch 12/15
150/150 [==============================] - 5s 31ms/step - loss: 0.5169 - accuracy: 0.8133 - val_loss: 0.5932 - val_accuracy: 0.7867
Epoch 13/15
150/150 [==============================] - 5s 31ms/step - loss: 0.5037 - accuracy: 0.8127 - val_loss: 0.5814 - val_accuracy: 0.7933
Epoch 14/15
150/150 [==============================] - 5s 31ms/step - loss: 0.4793 - accuracy: 0.8147 - val_loss: 0.6173 - val_accuracy: 0.7767
Epoch 15/15
150/150 [==============================] - 5s 31ms/step - loss: 0.4975 - accuracy: 0.8153 - val_loss: 0.6596 - val_accuracy: 0.7533
# Run this cell to plot accuracy vs epoch and loss vs epoch
plt.figure(figsize = (15, 5))
plt.subplot(121)
try:
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
except KeyError:
    try:
        plt.plot(history.history['acc'])
        plt.plot(history.history['val_acc'])
    except KeyError:
        plt.plot(history.history['categorical_accuracy'])
        plt.plot(history.history['val_categorical_accuracy'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'lower right')
plt.subplot(122)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Training', 'Validation'], loc = 'upper right')
plt.show()
# Create an iterable from the batched test dataset
test_dataset = test_dataset.batch(10)
iter_test_dataset = iter(test_dataset)
# Display model predictions for a sample of test images
plt.figure(figsize = (15, 8))
inx = np.random.choice(test_data.shape[0], 18, replace = False)
images, labels = next(iter_test_dataset)
probs = cifar_model(tf.reduce_mean(tf.cast(images, tf.float32), axis = -1, keepdims = True) / 255.)
preds = np.argmax(probs, axis = 1)
for n in range(10):
    ax = plt.subplot(2, 5, n + 1)
    plt.imshow(images[n])
    plt.title(cifar_labels[cifar_classes[np.where(labels[n].numpy() == 1.0)[0][0]]])
    plt.text(0, 35, "Model prediction: {}".format(cifar_labels[cifar_classes[preds[n]]]))
    plt.axis('off')
Congratulations for completing this programming assignment! In the next week of the course we will learn to develop models for sequential data.
In this notebook, you will use the text preprocessing tools and RNN models to build a character-level language model. You will then train your model on the works of Shakespeare, and use the network to generate your own text.
Some code cells are provided for you in the notebook. You should avoid editing provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line:
#### GRADED CELL ####
Don't move or edit this first line - this is what the automatic grader looks for to recognise graded cells. These cells require you to write your own code to complete them, and are automatically graded when you submit the notebook. Don't edit the function name or signature provided in these cells, otherwise the automatic grader might not function properly. Inside these graded cells, you can use any functions or classes that are imported below, but make sure you don't use any variables that are outside the scope of the function.
Complete all the tasks you are asked for in the worksheet. When you have finished and are happy with your code, press the Submit Assignment button at the top of this notebook.
We'll start running some imports, and loading the dataset. Do not edit the existing imports in the following cell. If you would like to make further Tensorflow imports, you should add them here.
#### PACKAGE IMPORTS ####
# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook
import tensorflow as tf
import numpy as np
import json
import matplotlib.pyplot as plt
%matplotlib inline
# If you would like to make further imports from tensorflow, add them here
from tensorflow.keras.layers import Embedding, GRU, Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

In this assignment, you will use a subset of the Shakespeare dataset. It consists of a single text file with several excerpts concatenated together. The data is in raw text form and has not yet been preprocessed.
Your goal is to construct an unsupervised character-level sequence model that can generate text according to a distribution learned from the dataset.
# Load the text file into a string
with open('data/Shakespeare.txt', 'r', encoding = 'utf-8') as file:
    text = file.read()
# Create a list of chunks of text
text_chunks = text.split('.')
To give you a feel for what the text looks like, we will print a few chunks from the list.
# Display some randomly selected text samples
num_samples = 5
inx = np.random.choice(len(text_chunks), num_samples, replace = False)
for chunk in np.array(text_chunks)[inx]:
    print(chunk)
Give me your hand: I'll privily away COMINIUS: Our spoils he kick'd at, And look'd upon things precious as they were The common muck of the world: he covets less Than misery itself would give; rewards His deeds with doing them, and is content To spend the time to end it CAMILLO: I think, this coming summer, the King of Sicilia means to pay Bohemia the visitation which he justly owes him TRANIO: Faith, he is gone unto the taming-school WARWICK: Your grace hath still been famed for virtuous; And now may seem as wise as virtuous, By spying and avoiding fortune's malice, For few men rightly temper with the stars: Yet in this one thing let me blame your grace, For choosing me when Clarence is in place
You should now write a function that returns a Tokenizer object. The function takes a list of strings as an argument, and should create a Tokenizer according to the following specification:
- The Tokenizer should tokenize at the character level.
- It should not filter out any characters, and should not convert characters to lowercase.
- The Tokenizer should be fit to the list_of_strings argument and returned by the function.
Hint: you may need to refer to the documentation for the Tokenizer.
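As a small (ungraded) illustration of these options, the snippet below fits a character-level Tokenizer on two made-up strings. With `char_level=True` every character (including spaces) gets its own integer token, and `lower=False` keeps the uppercase 'T' distinct from 't':

```python
from tensorflow.keras.preprocessing.text import Tokenizer

# Character-level tokenizer: no filtering, case preserved
tok = Tokenizer(char_level=True, filters=None, lower=False)
tok.fit_on_texts(['To be', 'or not'])

# Each string maps to one integer token per character
print(tok.texts_to_sequences(['be or']))
print(tok.word_index)
```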
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def create_character_tokenizer(list_of_strings):
    """
    This function takes a list of strings as its argument. It should create
    and return a Tokenizer according to the above specifications.
    """
    # Character-level tokenizer that keeps all characters and preserves case
    tokenizer = Tokenizer(char_level = True, filters = None, lower = False)
    tokenizer.fit_on_texts(list_of_strings)
    return tokenizer
# Get the tokenizer
tokenizer = create_character_tokenizer(text_chunks)
You should now write a function to use the tokenizer to map each string in text_chunks to its corresponding encoded sequence. The following function takes a fitted Tokenizer object in the first argument (as returned by create_character_tokenizer) and a list of strings in the second argument. The function should return a list of lists, where each sublist is a sequence of integer tokens encoding the text sequences according to the mapping stored in the tokenizer.
Hint: you may need to refer to the documentation for the Tokenizer.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def strings_to_sequences(tokenizer, list_of_strings):
    """
    This function takes a tokenizer object and a list of strings as its arguments.
    It should use the tokenizer to map the text chunks to sequences of tokens and
    then return this list of encoded sequences.
    """
    return tokenizer.texts_to_sequences(list_of_strings)
# Encode the text chunks into tokens
seq_chunks = strings_to_sequences(tokenizer, text_chunks)
Since not all of the text chunks are the same length, you will need to pad them in order to train on batches. You should now complete the following function, which takes the list of lists of tokens, and creates a single numpy array with the token sequences in the rows, according to the following specification:
- The sequences should be padded (and truncated where necessary) to a fixed length of 500 tokens.
- Both padding and truncating should be applied at the start ('pre') of each sequence.

The function should then return the resulting numpy array.
Hint: you may want to refer to the documentation for the pad_sequences function.
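To see what pre-padding looks like, here is a small (ungraded) example with made-up token sequences and a short maxlen of 5 instead of 500:

```python
from tensorflow.keras.preprocessing.sequence import pad_sequences

sequences = [[1, 2, 3], [4, 5]]

# Pre-pad with zeros up to a fixed length of 5; longer sequences
# would be pre-truncated (tokens dropped from the start)
padded = pad_sequences(sequences, maxlen=5, padding='pre', truncating='pre')
print(padded)
# [[0 0 1 2 3]
#  [0 0 0 4 5]]
```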
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def make_padded_dataset(sequence_chunks):
    """
    This function takes a list of lists of tokenized sequences, and transforms
    them into a 2D numpy array, padding the sequences as necessary according to
    the above specification. The function should then return the numpy array.
    """
    return pad_sequences(sequence_chunks, maxlen = 500, padding = 'pre', truncating = 'pre')
# Pad the token sequence chunks and get the numpy array
padded_sequences = make_padded_dataset(seq_chunks)
Now you are ready to build your RNN model. The model will receive a sequence of characters and predict the next character in the sequence. At training time, the model can be passed an input sequence, with the target sequence shifted by one.
For example, the expression To be or not to be appears in Shakespeare's play 'Hamlet'. Given input To be or not to b, the correct prediction is o be or not to be. Notice that the prediction is the same length as the input!
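The shift-by-one relationship between inputs and targets can be produced with plain numpy slicing. This (ungraded) sketch uses a single made-up padded row:

```python
import numpy as np

# Each row is a padded token sequence; inputs drop the last token and
# targets drop the first, so the target is the input shifted by one step
padded = np.array([[0, 5, 1, 2, 3, 4]])
inputs, targets = padded[:, :-1], padded[:, 1:]
print(inputs)   # [[0 5 1 2 3]]
print(targets)  # [[5 1 2 3 4]]
```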

You should now write the following function to create an input and target array from the current padded_sequences array. The function has a single argument that is a 2D numpy array of shape (num_examples, max_seq_len). It should fulfil the following specification:
- Both the input and output arrays should have shape (num_examples, max_seq_len - 1).
- The input array should contain the first max_seq_len - 1 tokens of each sequence.
- The output array should contain the last max_seq_len - 1 tokens of each sequence.

The function should then return the tuple (input_array, output_array). Note that it is possible to complete this function using numpy indexing alone!
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def create_inputs_and_targets(array_of_sequences):
    """
    This function takes a 2D numpy array of token sequences, and returns a tuple of two
    elements: the first element is the input array and the second element is the output
    array, which are defined according to the above specification.
    """
    return array_of_sequences[:, :-1], array_of_sequences[:, 1:]
# Create the input and output arrays
input_seq, target_seq = create_inputs_and_targets(padded_sequences)
We will build our RNN language model to be stateful, so that the internal state of the RNN will be maintained across batches. For this to be effective, we need to make sure that each element of every batch follows on from the corresponding element of the preceding batch (you may want to look back at the "Stateful RNNs" reading notebook earlier in the week).
The following code processes the input and output sequence arrays so that they are ready to be split into batches for training a stateful RNN, by re-ordering the sequence examples (the rows) according to a specified batch size.
# Fix the batch size for training
batch_size = 32
# Prepare input and output arrays for training the stateful RNN
num_examples = input_seq.shape[0]
num_processed_examples = num_examples - (num_examples % batch_size)
input_seq = input_seq[:num_processed_examples]
target_seq = target_seq[:num_processed_examples]
steps = int(num_processed_examples / batch_size)  # steps per epoch
inx = np.empty((0,), dtype=np.int32)
for i in range(steps):
    inx = np.concatenate((inx, i + np.arange(0, num_processed_examples, steps)))
input_seq_stateful = input_seq[inx]
target_seq_stateful = target_seq[inx]
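To see what this re-ordering does, here is a scaled-down (ungraded) version with 6 examples and a batch size of 2. Each slot of a batch continues from the same slot of the previous batch, which is exactly what a stateful RNN needs:

```python
import numpy as np

# Toy setting: 6 examples, batch size 2 -> 3 batches (steps) per epoch
num_examples, batch_size = 6, 2
steps = num_examples // batch_size

inx = np.empty((0,), dtype=np.int32)
for i in range(steps):
    inx = np.concatenate((inx, i + np.arange(0, num_examples, steps)))

print(inx.reshape(steps, batch_size))
# [[0 3]
#  [1 4]
#  [2 5]]
# Batch 0 holds examples (0, 3), batch 1 holds (1, 4), batch 2 holds (2, 5):
# each batch slot follows on from the same slot in the preceding batch.
```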
We will set aside approximately 20% of the data for validation.
# Create the training and validation splits
num_train_examples = int(batch_size * ((0.8 * num_processed_examples) // batch_size))
input_train = input_seq_stateful[:num_train_examples]
target_train = target_seq_stateful[:num_train_examples]
input_valid = input_seq_stateful[num_train_examples:]
target_valid = target_seq_stateful[num_train_examples:]
You should now write a function to take the training and validation input and target arrays, and create training and validation tf.data.Dataset objects. The function takes an input array and target array in the first two arguments, and the batch size in the third argument. Your function should do the following:
- Create a Dataset using the from_tensor_slices static method, passing in a tuple of the input and output numpy arrays.
- Batch the Dataset using the batch_size argument, setting drop_remainder to True.

The function should then return the Dataset object.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def make_Dataset(input_array, target_array, batch_size):
    """
    This function takes two 2D numpy arrays in the first two arguments, and an integer
    batch_size in the third argument. It should create and return a Dataset object
    using the two numpy arrays and batch size according to the above specification.
    """
    dataset = tf.data.Dataset.from_tensor_slices(tensors = (input_array, target_array))
    # Drop the remainder so every batch has exactly batch_size elements
    return dataset.batch(batch_size = batch_size, drop_remainder = True)
# Create the training and validation Datasets
train_data = make_Dataset(input_train, target_train, batch_size)
valid_data = make_Dataset(input_valid, target_valid, batch_size)
You are now ready to build your RNN character-level language model. You should write the following function to build the model; the function takes arguments for the batch size and vocabulary size (number of tokens). Using the Sequential API, your function should build your model according to the following specifications:
- The first layer should be an Embedding layer with an input dimension of vocab_size from the function argument and an output dimension of 256. It should mask the zero padding token, and set batch_input_shape to (batch_size, None) (a fixed batch size is required for stateful RNNs).
- The second layer should be a stateful GRU layer with 1024 units that returns the full sequence of outputs.
- The final layer should be a Dense layer with vocab_size units and no activation function.

In total, the network should have 3 layers.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def get_model(vocab_size, batch_size):
    """
    This function takes a vocabulary size and batch size, and builds and returns a
    Sequential model according to the above specification.
    """
    return Sequential([
        # Embedding masks the zero padding token; fixed batch size for statefulness
        Embedding(
            input_dim = vocab_size, output_dim = 256,
            mask_zero = True, batch_input_shape = (batch_size, None)
        ),
        GRU(units = 1024, stateful = True, return_sequences = True),
        Dense(units = vocab_size),
    ])
# Build the model and print the model summary
model = get_model(len(tokenizer.word_index) + 1, batch_size)
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (32, None, 256)           16640
_________________________________________________________________
gru (GRU)                    (32, None, 1024)          3938304
_________________________________________________________________
dense (Dense)                (32, None, 65)            66625
=================================================================
Total params: 4,021,569
Trainable params: 4,021,569
Non-trainable params: 0
_________________________________________________________________
You are now ready to compile and train the model. For this model and dataset, the training time is very long. Therefore for this assignment it is not a requirement to train the model. We have pre-trained a model for you (using the code below) and saved the model weights, which can be loaded to get the model predictions.
It is recommended to use accelerator hardware (e.g. using Colab) when training this model. It would also be beneficial to increase the size of the model, e.g. by stacking extra recurrent layers.
# Choose whether to train a new model or load the pre-trained model
skip_training = True
# Compile and train the model, or load pre-trained weights
if not skip_training:
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath = './models/ckpt',
        save_weights_only = True,
        save_best_only = True
    )
    model.compile(
        optimizer = 'adam',
        loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits = True),
        metrics = ['sparse_categorical_accuracy']
    )
    history = model.fit(
        train_data, epochs = 15, validation_data = valid_data,
        validation_steps = 50, callbacks = [checkpoint_callback]
    )
# Save model history as a json file, or load it if using pre-trained weights
if not skip_training:
    history_dict = dict()
    for k, v in history.history.items():
        history_dict[k] = [float(val) for val in v]
    with open('models/history.json', 'w+') as json_file:
        json.dump(history_dict, json_file, sort_keys = True, indent = 4)
else:
    with open('models/history.json', 'r') as json_file:
        history_dict = json.load(json_file)
# Run this cell to plot accuracy vs epoch and loss vs epoch
plt.figure(figsize = (15, 5))
plt.subplot(121)
plt.plot(history_dict['sparse_categorical_accuracy'])
plt.plot(history_dict['val_sparse_categorical_accuracy'])
plt.title('Accuracy vs. epochs')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.xticks(np.arange(len(history_dict['sparse_categorical_accuracy'])))
ax = plt.gca()
ax.set_xticklabels(1 + np.arange(len(history_dict['sparse_categorical_accuracy'])))
plt.legend(['Training', 'Validation'], loc = 'lower right')
plt.subplot(122)
plt.plot(history_dict['loss'])
plt.plot(history_dict['val_loss'])
plt.title('Loss vs. epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.xticks(np.arange(len(history_dict['sparse_categorical_accuracy'])))
ax = plt.gca()
ax.set_xticklabels(1 + np.arange(len(history_dict['sparse_categorical_accuracy'])))
plt.legend(['Training', 'Validation'], loc = 'upper right')
plt.show()
You can now use the model to generate text! In order to generate a single text sequence, the model needs to be rebuilt with a batch size of 1.
# Re-build the model and load the saved weights
model = get_model(len(tokenizer.word_index) + 1, batch_size = 1)
model.load_weights(tf.train.latest_checkpoint('./models/'))
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program. Two checkpoint references resolved to different objects (<tensorflow.python.keras.layers.embeddings.Embedding object at 0x7f642060a090> and <tensorflow.python.keras.layers.recurrent_v2.GRU object at 0x7f642060a4d0>).
WARNING:tensorflow:Inconsistent references when loading the checkpoint into this object graph. Either the Trackable object references in the Python program have changed in an incompatible way, or the checkpoint was generated in an incompatible program. Two checkpoint references resolved to different objects (<tensorflow.python.keras.layers.recurrent_v2.GRU object at 0x7f642060a4d0> and <tensorflow.python.keras.layers.core.Dense object at 0x7f642060ad90>).
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f63fc58d210>
An algorithm to generate text is as follows:
- Specify a seed string (e.g. 'ROMEO:') to get the network started, and define the number of characters for the model to generate, num_generation_steps.
- Repeat the following for num_generation_steps - 1 steps: get the model's logits prediction for the last time step, sample a new token from the resulting categorical distribution, append it to the token sequence, and feed it back into the model together with the current recurrent state.
- Take the final list of tokens and convert to text using the Tokenizer.
Note that the internal state of the recurrent layer can be accessed using the states property. For the GRU layer, it is a list of one variable:
# Inspect the model's current recurrent state
model.layers[1].states
[<tf.Variable 'gru_1/Variable:0' shape=(1, 1024) dtype=float32, numpy=array([[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)>]
We will break the algorithm down into two steps. First, you should now complete the following function that takes a sequence of tokens of any length and returns the model's prediction (the logits) for the last time step. The specification is as follows:
- The token sequence should be a nested list, e.g. [[1, 2, 3, 4]].
- If initial_state is None, then the function should reset the state of the recurrent layer to zeros.
- If initial_state is a 2D Tensor or numpy array, assign the value of the internal state of the GRU layer to this argument.

The function should then return the logits as a 2D numpy array, where the first dimension is equal to 1 (the batch size).
Hint: the internal state of the recurrent layer can be reset to zeros using the reset_states method.
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def get_logits(model, token_sequence, initial_state = None):
    """
    This function takes a model object, a token sequence and an optional initial
    state for the recurrent layer. The function should return the logits prediction
    for the final time step as a 2D numpy array.
    """
    # Reset the GRU state: zeros if initial_state is None, else the given state
    model.layers[1].reset_states(states = initial_state)
    # Return the logits for the final time step, keeping a batch dimension of 1
    return model.predict(tf.constant(token_sequence))[np.newaxis, 0, -1]
# Test the get_logits function by passing a dummy token sequence
dummy_initial_state = tf.random.normal(model.layers[1].states[0].shape)
get_logits(model, [[1, 2, 3, 4]], initial_state = dummy_initial_state)
array([[-6.7584934 , 1.6704779 , 2.1975632 , 0.78530157, 2.5938761 ,
-1.2590218 , -6.7742476 , 1.9836363 , 7.7638574 , 2.8425632 ,
3.2998426 , 1.5371597 , 4.812756 , 0.1021549 , 4.9321613 ,
6.5411215 , -1.9765807 , 0.9653599 , 2.7875838 , -0.78107035,
2.3413863 , -0.69934464, -9.423188 , -0.3905293 , 2.8863978 ,
-0.88447905, -6.1078224 , -1.4959075 , -0.38982463, -4.0458646 ,
1.596179 , -6.0312667 , -6.32223 , -4.635657 , -1.6404712 ,
-4.5672054 , -5.619035 , -5.3515863 , -2.8870163 , -2.4968946 ,
-6.037463 , -6.256722 , -5.4220247 , -1.795031 , -4.7974024 ,
-5.2217336 , -1.156556 , -5.8715396 , -1.0469037 , -6.931812 ,
-4.9046717 , -4.931508 , -7.1691175 , -6.3875237 , -5.034652 ,
0.2496903 , -1.1603751 , -1.4467968 , -5.7554555 , -5.2213883 ,
-2.3573935 , -1.0116509 , -4.4044657 , -5.247178 , -3.4933565 ]],
dtype=float32)
You should now write a function that takes a logits prediction similar to the above, uses it to create a categorical distribution, and samples a token from this distribution. The following function takes a 2D numpy array logits as an argument, and should return a single integer prediction that is sampled from the categorical distribution.
Hint: you might find the tf.random.categorical function useful for this; see the documentation here.
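As a quick (ungraded) check of the hint, the example below samples from logits (for a made-up 4-token vocabulary) that put all probability mass on token 2, so the sample is deterministic:

```python
import tensorflow as tf

# Logits over a 4-token vocabulary; only token 2 has non-zero probability
logits = tf.constant([[-float('inf'), -float('inf'), 0., -float('inf')]])

# Draw one sample from the categorical distribution defined by the logits
sample = tf.random.categorical(logits=logits, num_samples=1)
print(int(sample.numpy()[0, 0]))  # 2
```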
#### GRADED CELL ####
# Complete the following function.
# Make sure not to change the function name or arguments.
def sample_token(logits):
    """
    This function takes a 2D numpy array as an input, and constructs a
    categorical distribution using it. It should then sample from this
    distribution and return the sample as a single integer.
    """
    # Sample one token index from the categorical distribution over the logits
    return tf.random.categorical(logits = logits, num_samples = 1).numpy().flatten()[0]
# Test the sample_token function by passing dummy logits
dummy_initial_state = tf.random.normal(model.layers[1].states[0].shape)
dummy_logits = get_logits(model, [[1, 2, 3, 4]], initial_state = dummy_initial_state)
sample_token(dummy_logits)
15
logits_size = dummy_logits.shape[1]
dummy_logits = -np.inf * np.ones((1, logits_size))
dummy_logits[0, 20] = 0
sample_token(dummy_logits)
random_inx = np.random.choice(logits_size, 2, replace = False)
random_inx1, random_inx2 = random_inx[0], random_inx[1]
print(random_inx1, random_inx2)
dummy_logits = -np.inf * np.ones((1, logits_size))
dummy_logits[0, random_inx1] = 0
dummy_logits[0, random_inx2] = 0
sampled_token = []
for _ in range(100):
    sampled_token.append(sample_token(dummy_logits))
l_tokens, l_counts = np.unique(np.array(sampled_token), return_counts = True)
len(l_tokens) == 2
18 6
True
You are now ready to generate text from the model!
# Create a seed string and number of generation steps
init_string = 'ROMEO:'
num_generation_steps = 1000
# Use the model to generate a token sequence
token_sequence = tokenizer.texts_to_sequences([init_string])
initial_state = None
input_sequence = token_sequence
for _ in range(num_generation_steps):
    logits = get_logits(model, input_sequence, initial_state=initial_state)
    sampled_token = sample_token(logits)
    token_sequence[0].append(sampled_token)
    # Feed the sampled token back in, carrying the recurrent state forward
    input_sequence = [[sampled_token]]
    initial_state = model.layers[1].states[0].numpy()
print(tokenizer.sequences_to_texts(token_sequence)[0][::2])
ROMEO: My delicate the valiant will that one Thou art alience; that takes the grave is ble; Out of passion return the case, against whose heavy death, A happy teirs that in my back hath made be the sight And then minute this grace! O my! This seen, It cannot be a plainful wife all brawn More man in shemelss for looks fear; No, as af A genly mean to any thine; For heaven comilius, as I think, shall kindly there; But they stand not the most shame of hand, my instruct, By any man the sknute's woo Angelo, Death are all all that gives change can twound Horth, Say into the words: there's not your actions, I'll bett thee for my pamp, so hidselves are: And here shall way this for fault that is loved: The lew the season at the ben-time loat, That a most been by upon the Capulett, that He dies twis veil Men from the king's a gost; Great Paris; There, if a fair quiet canvey Are bacrised, a gies, there, father to look, Proclaim thee but lie, mean to might the geor-man, For counsel in the duke's ten take
Congratulations for completing this programming assignment! In the next week of the course we will see how to build customised models and layers, and make custom training loops.
In this notebook, you will use the model subclassing API together with custom layers to create a residual network architecture. You will then train your custom model on the Fashion-MNIST dataset by using a custom training loop and implementing the automatic differentiation tools in Tensorflow to calculate the gradients for backpropagation.
Some code cells are provided for you in the notebook. You should avoid editing provided code, and make sure to execute the cells in order to avoid unexpected errors. Some cells begin with the line:
#### GRADED CELL ####
Don't move or edit this first line - this is what the automatic grader looks for to recognise graded cells. These cells require you to write your own code to complete them, and are automatically graded when you submit the notebook. Don't edit the function name or signature provided in these cells, otherwise the automatic grader might not function properly. Inside these graded cells, you can use any functions or classes that are imported below, but make sure you don't use any variables that are outside the scope of the function.
Complete all the tasks you are asked for in the worksheet. When you have finished and are happy with your code, press the Submit Assignment button at the top of this notebook.
We'll start by running some imports and loading the dataset. Do not edit the existing imports in the following cell. If you would like to make further Tensorflow imports, you should add them here.
#### PACKAGE IMPORTS ####
# Run this cell first to import all required packages. Do not make any imports elsewhere in the notebook
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Layer, BatchNormalization, Conv2D, Dense, Flatten, Add
import numpy as np
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
# If you would like to make further imports from tensorflow, add them here

In this assignment, you will use the Fashion-MNIST dataset. It consists of a training set of 60,000 images of fashion items with corresponding labels, and a test set of 10,000 images. The images have been normalised and centred. The dataset is frequently used in machine learning research, especially as a drop-in replacement for the MNIST dataset.
Your goal is to construct a ResNet model that classifies images of fashion items into one of 10 classes.
For this programming assignment, we will take a smaller sample of the dataset to reduce the training time.
# Load and preprocess the Fashion-MNIST dataset
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
train_images = train_images.astype(np.float32)
test_images = test_images.astype(np.float32)
train_images = train_images[:5000] / 255.
train_labels = train_labels[:5000]
test_images = test_images / 255.
train_images = train_images[..., np.newaxis]
test_images = test_images[..., np.newaxis]
# Create Dataset objects for the training and test sets
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels))
train_dataset = train_dataset.batch(32)
test_dataset = tf.data.Dataset.from_tensor_slices((test_images, test_labels))
test_dataset = test_dataset.batch(32)
# Get dataset labels
image_labels = [
'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt',
'Sneaker', 'Bag', 'Ankle boot'
]
You should now create a first custom layer for a residual block of your network. Using layer subclassing, build your custom layer according to the following spec:
The class should have __init__, build and call methods. The __init__ method has been completed for you: it calls the base Layer class initializer, passing on any keyword arguments.

The build method should create the layers. It will take an input_shape argument, and should extract the number of filters from this argument. It should create:
- A BatchNormalization layer: this is the first layer in the block, so it should set the input shape keyword argument
- A Conv2D layer with the same number of filters as the block input, a 3x3 kernel size, 'SAME' padding, and no activation function
- Another BatchNormalization layer
- Another Conv2D layer with the same number of filters as the block input, a 3x3 kernel size, 'SAME' padding, and no activation function

The call method should then process the input through the layers in order:
- The first BatchNormalization layer: make sure to pass the training keyword argument
- A tf.nn.relu activation function
- The first Conv2D layer
- The second BatchNormalization layer: make sure to pass the training keyword argument
- A tf.nn.relu activation function
- The second Conv2D layer

The layer output should be the sum of the block input and the output of the second Conv2D layer (the residual connection).

#### GRADED CELL ####
# Complete the following class.
# Make sure to not change the class or method names or arguments.
class ResidualBlock(Layer):

    def __init__(self, **kwargs):
        super(ResidualBlock, self).__init__(**kwargs)

    def build(self, input_shape):
        """
        This method should build the layers according to the above specification. Make sure
        to use the input_shape argument to get the correct number of filters, and to set the
        input_shape of the first layer in the block.
        """
        self.bn_1 = BatchNormalization(input_shape = input_shape)
        self.conv_1 = Conv2D(filters = input_shape[-1], kernel_size = (3, 3), padding = 'same')
        self.bn_2 = BatchNormalization()
        self.conv_2 = Conv2D(filters = input_shape[-1], kernel_size = (3, 3), padding = 'same')

    def call(self, inputs, training = False):
        """
        This method should contain the code for calling the layer according to the above
        specification, using the layer objects set up in the build method.
        """
        h = self.bn_1(inputs, training = training)
        h = tf.nn.relu(h)
        h = self.conv_1(h)
        h = self.bn_2(h, training = training)
        h = tf.nn.relu(h)
        h = self.conv_2(h)
        return Add()([inputs, h])
# Test your custom layer - the following should create a model using your layer
test_model = tf.keras.Sequential([ResidualBlock(input_shape = (28, 28, 1), name = "residual_block")])
test_model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
residual_block (ResidualBloc (None, 28, 28, 1)         28        
=================================================================
Total params: 28
Trainable params: 24
Non-trainable params: 4
_________________________________________________________________
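As a sanity check, the parameter count in the summary above can be reproduced by hand. The sketch below is pure-Python arithmetic (no TensorFlow needed), assuming the standard Keras parameter counts: four parameters per channel for BatchNormalization (gamma and beta are trainable; the moving mean and variance are not), and kernel weights plus one bias per filter for Conv2D.

```python
# Parameter count for ResidualBlock on a 1-channel input, as in the summary above.
channels = 1

# Each BatchNormalization layer: gamma, beta (trainable) and
# moving mean, moving variance (non-trainable), one of each per channel.
bn_trainable = 2 * channels
bn_non_trainable = 2 * channels

# Each 3x3 Conv2D maps `channels` filters to `channels` filters, one bias per filter.
conv_params = 3 * 3 * channels * channels + channels

trainable = 2 * bn_trainable + 2 * conv_params  # 4 + 20 = 24
non_trainable = 2 * bn_non_trainable            # 4
total = trainable + non_trainable               # 28

print(trainable, non_trainable, total)
```

These match the 24 trainable, 4 non-trainable and 28 total parameters reported by the summary.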
You should now create a second custom layer for a residual block of your network. This layer will be used to change the number of filters within the block. Using layer subclassing, build your custom layer according to the following spec:
The class should have __init__, build and call methods. The __init__ method should call the base Layer class initializer, passing on any keyword arguments. It should also accept an out_filters argument, and save it as a class attribute.

The build method should create the layers. It will take an input_shape argument, and should extract the number of input filters from this argument. It should create:
- A BatchNormalization layer: this is the first layer in the block, so it should set the input shape keyword argument
- A Conv2D layer with the same number of filters as the block input, a 3x3 kernel size, "SAME" padding, and no activation function
- Another BatchNormalization layer
- Another Conv2D layer with out_filters number of filters, a 3x3 kernel size, "SAME" padding, and no activation function
- A final Conv2D layer with out_filters number of filters, a 1x1 kernel size, and no activation function

The call method should then process the input through the layers in order:
- The first BatchNormalization layer: make sure to pass the training keyword argument
- A tf.nn.relu activation function
- The first Conv2D layer
- The second BatchNormalization layer: make sure to pass the training keyword argument
- A tf.nn.relu activation function
- The second 3x3 Conv2D layer

The layer output should be the sum of the second 3x3 Conv2D layer output and the block input transformed by the 1x1 Conv2D layer (the residual connection).

#### GRADED CELL ####
# Complete the following class.
# Make sure to not change the class or method names or arguments.
class FiltersChangeResidualBlock(Layer):

    def __init__(self, out_filters, **kwargs):
        """
        The class initialiser should call the base class initialiser, passing any keyword
        arguments along. It should also set the number of filters as a class attribute.
        """
        super(FiltersChangeResidualBlock, self).__init__(**kwargs)
        self.out_filters = out_filters

    def build(self, input_shape):
        """
        This method should build the layers according to the above specification. Make sure
        to use the input_shape argument to get the correct number of filters, and to set the
        input_shape of the first layer in the block.
        """
        self.bn_1 = BatchNormalization(input_shape = input_shape)
        self.conv_1 = Conv2D(filters = input_shape[-1], kernel_size = (3, 3), padding = 'same')
        self.bn_2 = BatchNormalization()
        self.conv_2 = Conv2D(filters = self.out_filters, kernel_size = (3, 3), padding = 'same')
        self.conv_3 = Conv2D(filters = self.out_filters, kernel_size = (1, 1))

    def call(self, inputs, training = False):
        """
        This method should contain the code for calling the layer according to the above
        specification, using the layer objects set up in the build method.
        """
        h = self.bn_1(inputs, training = training)
        h = tf.nn.relu(h)
        h = self.conv_1(h)
        h = self.bn_2(h, training = training)
        h = tf.nn.relu(h)
        h = self.conv_2(h)
        return Add()([self.conv_3(inputs), h])
# Test your custom layer - the following should create a model using your layer
test_model = tf.keras.Sequential([
    FiltersChangeResidualBlock(16, input_shape = (32, 32, 3), name = "fc_resnet_block")
])
test_model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
fc_resnet_block (FiltersChan (None, 32, 32, 16)        620       
=================================================================
Total params: 620
Trainable params: 608
Non-trainable params: 12
_________________________________________________________________
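Again, the parameter count in the summary above can be checked by hand. This pure-Python sketch assumes the same Keras conventions as before (kernel weights plus one bias per filter for Conv2D; four parameters per channel for BatchNormalization, two of them trainable); the helper function here is just for illustration.

```python
# Parameter count for FiltersChangeResidualBlock on a 3-channel input
# with out_filters = 16, matching the summary above.
in_channels, out_filters = 3, 16

def conv2d_params(kernel_h, kernel_w, in_c, out_c):
    # Kernel weights plus one bias per output filter.
    return kernel_h * kernel_w * in_c * out_c + out_c

bn = 4 * in_channels                                    # 12 per BatchNormalization layer
conv_1 = conv2d_params(3, 3, in_channels, in_channels)  # 84
conv_2 = conv2d_params(3, 3, in_channels, out_filters)  # 448
conv_3 = conv2d_params(1, 1, in_channels, out_filters)  # 64

total = 2 * bn + conv_1 + conv_2 + conv_3               # 620
non_trainable = 2 * (2 * in_channels)                   # moving statistics of both BN layers: 12
trainable = total - non_trainable                       # 608
print(total, trainable, non_trainable)
```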
You are now ready to build your ResNet model. Using model subclassing, build your model according to the following spec:
The model class should have __init__ and call methods. The __init__ method should call the base Model class initializer, passing on any keyword arguments. It should create the model layers:
- A Conv2D layer with 32 filters, a 7x7 kernel size and a stride of 2
- A ResidualBlock layer
- A Conv2D layer with 32 filters, a 3x3 kernel size and a stride of 2
- A FiltersChangeResidualBlock layer, with 64 output filters
- A Flatten layer
- A final Dense layer with 10 units and a softmax activation

The call method should then process the input through the layers in the order given above. Make sure to pass the training keyword argument to the residual blocks, to ensure the correct mode of operation for the batch norm layers. In total, your neural network should have six layers (counting each residual block as one layer).
#### GRADED CELL ####
# Complete the following class.
# Make sure to not change the class or method names or arguments.
class ResNetModel(Model):

    def __init__(self, **kwargs):
        """
        The class initialiser should call the base class initialiser, passing any keyword
        arguments along. It should also create the layers of the network according to the
        above specification.
        """
        super(ResNetModel, self).__init__(**kwargs)
        self.conv_1 = Conv2D(filters = 32, kernel_size = (7, 7), strides = 2)
        self.residual = ResidualBlock()
        self.conv_2 = Conv2D(filters = 32, kernel_size = (3, 3), strides = 2)
        self.filters_change_residual = FiltersChangeResidualBlock(out_filters = 64)
        self.flatten = Flatten()
        self.dense = Dense(units = 10, activation = 'softmax')

    def call(self, inputs, training = False):
        """
        This method should contain the code for calling the model according to the above
        specification, using the layer objects set up in the initialiser.
        """
        h = self.conv_1(inputs)
        h = self.residual(h, training = training)
        h = self.conv_2(h)
        h = self.filters_change_residual(h, training = training)
        h = self.flatten(h)
        return self.dense(h)
# Create the model
resnet_model = ResNetModel()
We will use the Adam optimizer with a learning rate of 0.001, and the sparse categorical cross entropy loss function.
# Create the optimizer and loss
optimizer_obj = tf.keras.optimizers.Adam(learning_rate = 0.001)
loss_obj = tf.keras.losses.SparseCategoricalCrossentropy()
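To see what the loss object computes: sparse categorical cross entropy takes integer class labels (rather than one-hot vectors) and returns the mean negative log of the probability the model assigns to the true class. A minimal pure-Python sketch, using made-up predicted probabilities:

```python
import math

# Hypothetical predicted probabilities over 3 classes for 2 examples,
# with integer (sparse) true labels.
y_pred = [[0.7, 0.2, 0.1],
          [0.1, 0.8, 0.1]]
y_true = [0, 1]

# Sparse categorical cross entropy: mean of -log(p[true class]) over the examples.
loss = sum(-math.log(p[t]) for p, t in zip(y_pred, y_true)) / len(y_true)
print(round(loss, 4))  # → 0.2899
```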
You should now create the grad function that will compute the forward and backward pass, and return the loss value and gradients that will be used in your custom training loop:
- The grad function takes a model instance, inputs, targets and the loss object above as arguments
- It should use a tf.GradientTape context to compute the forward pass and calculate the loss
- It should compute the gradient of the loss with respect to the trainable variables of the model
- The function should return the loss value and the gradients

#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
@tf.function
def grad(model, inputs, targets, loss):
    """
    This function should compute the loss and gradients of your model, corresponding to
    the inputs and targets provided. It should return the loss and gradients.
    """
    with tf.GradientTape() as tape:
        loss_value = loss(y_true = targets, y_pred = model(inputs))
    return loss_value, tape.gradient(loss_value, model.trainable_variables)
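To convince yourself of what tape.gradient returns, you can compare an exact derivative to a finite-difference approximation. The sketch below is a conceptual illustration in pure Python for the scalar function f(x) = x**2, whose exact derivative 2x is what automatic differentiation would produce:

```python
def f(x):
    return x ** 2

def finite_difference(f, x, eps=1e-6):
    # Central-difference approximation to df/dx.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

x = 3.0
analytic = 2 * x                 # exact gradient, as automatic differentiation computes
approx = finite_difference(f, x)
print(analytic, approx)          # the two agree to high precision
```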
You should now write a custom training loop. Complete the following function, according to the spec:
The train_resnet function takes the following arguments:
- model: an instance of your custom model
- num_epochs: the integer number of epochs to train the model
- dataset: a tf.data.Dataset object for the training data
- optimizer: an optimizer object, as created above
- loss: a sparse categorical cross entropy object, as created above
- grad_fn: your grad function above, that returns the loss and gradients for a given model, inputs and targets

The function should train the model by iterating over the dataset for the given number of epochs, using grad_fn to compute the gradients for each training batch, and updating the model parameters using optimizer.apply_gradients. You may also want to print out the loss and accuracy at each epoch during training.
#### GRADED CELL ####
# Complete the following function.
# Make sure to not change the function name or arguments.
def train_resnet(model, num_epochs, dataset, optimizer, loss, grad_fn):
    """
    This function should implement the custom training loop, as described above. It should
    return a tuple of two elements: the first element is a list of loss values per epoch, the
    second is a list of accuracy values per epoch
    """
    train_loss_results = []
    train_accuracy_results = []
    for epoch in range(num_epochs):
        epoch_loss_avg = tf.keras.metrics.Mean()
        epoch_accuracy = tf.keras.metrics.SparseCategoricalAccuracy()
        for x, y in dataset:
            loss_value, grads = grad_fn(model, x, y, loss)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))
            epoch_loss_avg(loss_value)
            epoch_accuracy(y_true = y, y_pred = model(x))
        train_loss_results.append(epoch_loss_avg.result())
        train_accuracy_results.append(epoch_accuracy.result())
    return train_loss_results, train_accuracy_results
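The structure of this loop (epoch loop, batch loop, gradient step, per-epoch metric tracking) can be seen in miniature without TensorFlow. The toy sketch below fits a single parameter w to minimise (w - 5)^2 with hand-coded gradients; the dataset and learning rate here are made up for illustration only:

```python
def toy_grad_fn(w, batch):
    # Loss (w - target)^2 averaged over the batch, with its exact gradient 2(w - target).
    loss = sum((w - t) ** 2 for t in batch) / len(batch)
    grad = sum(2 * (w - t) for t in batch) / len(batch)
    return loss, grad

def toy_train(w, num_epochs, dataset, learning_rate, grad_fn):
    loss_results = []
    for epoch in range(num_epochs):
        epoch_losses = []
        for batch in dataset:
            loss, grad = grad_fn(w, batch)
            w = w - learning_rate * grad      # the optimizer.apply_gradients step
            epoch_losses.append(loss)
        loss_results.append(sum(epoch_losses) / len(epoch_losses))
    return w, loss_results

dataset = [[5.0, 5.0], [5.0]]                 # two toy "batches", all with target 5
w, losses = toy_train(0.0, 20, dataset, 0.1, toy_grad_fn)
print(round(w, 3))                            # w converges towards 5
```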
# Train the model for 8 epochs
train_loss_results, train_accuracy_results = train_resnet(
    resnet_model, 8, train_dataset, optimizer_obj, loss_obj, grad
)
fig, axes = plt.subplots(1, 2, sharex = True, figsize = (12, 5))
axes[0].set_xlabel('Epochs', fontsize = 14)
axes[0].set_ylabel('Loss', fontsize = 14)
axes[0].set_title('Loss vs epochs')
axes[0].plot(train_loss_results)
axes[1].set_title('Accuracy vs epochs')
axes[1].set_ylabel('Accuracy', fontsize = 14)
axes[1].set_xlabel('Epochs', fontsize = 14)
axes[1].plot(train_accuracy_results)
plt.show()
# Compute the test loss and accuracy
epoch_loss_avg = tf.keras.metrics.Mean()
epoch_accuracy = tf.keras.metrics.CategoricalAccuracy()
for x, y in test_dataset:
    model_output = resnet_model(x)
    epoch_loss_avg(loss_obj(y, model_output))
    epoch_accuracy(to_categorical(y), model_output)
print("Test loss: {:.3f}".format(epoch_loss_avg.result().numpy()))
print("Test accuracy: {:.3%}".format(epoch_accuracy.result().numpy()))
Test loss: 0.526
Test accuracy: 83.790%
Let's see some model predictions! We will randomly select four images from the test data, and display the image and label for each.
For each test image, the model's prediction (the label with maximum probability) is shown, together with a plot of the model's categorical distribution.
# Run this cell to get model predictions on randomly selected test images
num_test_images = test_images.shape[0]
random_inx = np.random.choice(test_images.shape[0], 4)
random_test_images = test_images[random_inx, ...]
random_test_labels = test_labels[random_inx, ...]
predictions = resnet_model(random_test_images)
fig, axes = plt.subplots(4, 2, figsize = (16, 12))
fig.subplots_adjust(hspace = 0.5, wspace = -0.2)
for i, (prediction, image, label) in enumerate(zip(predictions, random_test_images, random_test_labels)):
    axes[i, 0].imshow(np.squeeze(image))
    axes[i, 0].get_xaxis().set_visible(False)
    axes[i, 0].get_yaxis().set_visible(False)
    axes[i, 0].text(5., -2., f'Class {label} ({image_labels[label]})')
    axes[i, 1].bar(np.arange(len(prediction)), prediction)
    axes[i, 1].set_xticks(np.arange(len(prediction)))
    axes[i, 1].set_xticklabels(image_labels, rotation = 0)
    pred_inx = np.argmax(prediction)
    axes[i, 1].set_title(f"Categorical distribution. Model prediction: {image_labels[pred_inx]}")
plt.show()
Congratulations on completing this programming assignment! You're now ready to move on to the capstone project for this course.